Latest Publications in SIAM Journal on Mathematics of Data Science

Convergence of a Constrained Vector Extrapolation Scheme
Q1 MATHEMATICS, APPLIED Pub Date: 2022-07-11 DOI: 10.1137/21m1428030
Mathieu Barré, Adrien B. Taylor, A. d’Aspremont
{"title":"Convergence of a Constrained Vector Extrapolation Scheme","authors":"Mathieu Barré, Adrien B. Taylor, A. d’Aspremont","doi":"10.1137/21m1428030","DOIUrl":"https://doi.org/10.1137/21m1428030","url":null,"abstract":"","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"22 1","pages":"979-1002"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72852432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Numerical Considerations and a New Implementation for Invariant Coordinate Selection
Q1 MATHEMATICS, APPLIED Pub Date: 2022-07-05 DOI: 10.1137/22M1498759
A. Archimbaud, Z. Drmač, K. Nordhausen, Una Radojicic, A. Ruiz-Gazen
Invariant Coordinate Selection (ICS) is a multivariate data transformation and dimension reduction method that can be useful in many different contexts. It can be used for outlier detection or cluster identification, and can be seen as an independent component or non-Gaussian component analysis method. The usual implementation of ICS is based on a joint diagonalization of two scatter matrices and may be numerically unstable in some ill-conditioned situations. We focus on one-step M-scatter matrices and propose a new implementation of ICS based on a pivoted QR factorization of the centered data set. This factorization avoids the direct computation of the scatter matrices and their inverses and brings numerical stability to the algorithm. Furthermore, the row and column pivoting leads to a rank-revealing procedure that allows computation of ICS when the scatter matrices are not of full rank. Several artificial and real data sets illustrate the advantages of the new implementation over the original one.
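To make the contrast concrete, here is a minimal NumPy sketch of the classical ICS construction the abstract calls "the usual implementation": joint diagonalization of the covariance matrix and a fourth-moment one-step M-scatter via a generalized eigenproblem. The paper's pivoted-QR implementation, which avoids forming these matrices, is not reproduced here, and the function name is illustrative.

```python
import numpy as np

def classical_ics(X):
    """Textbook ICS: jointly diagonalize the covariance and a fourth-moment scatter."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)                            # center the data
    S1 = Xc.T @ Xc / (n - 1)                           # ordinary covariance matrix
    d2 = np.sum(Xc @ np.linalg.inv(S1) * Xc, axis=1)   # squared Mahalanobis distances
    S2 = (Xc * d2[:, None]).T @ Xc / (n * (p + 2))     # one-step M-scatter (COV4)
    # joint diagonalization via the generalized eigenproblem S2 b = rho * S1 b
    evals, B = np.linalg.eig(np.linalg.solve(S1, S2))
    order = np.argsort(evals.real)[::-1]
    return evals.real[order], Xc @ B.real[:, order]    # eigenvalues, invariant coordinates

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(95, 3)), rng.normal(5.0, 1.0, size=(5, 3))])
rho, Z = classical_ics(X)   # extreme invariant coordinates flag the 5 shifted outliers
```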
Citations: 0
Data-Driven Mirror Descent with Input-Convex Neural Networks
Q1 MATHEMATICS, APPLIED Pub Date: 2022-06-14 DOI: 10.1137/22m1508613
Hongwei Tan, Subhadip Mukherjee, Junqi Tang, C. Schonlieb
Learning-to-optimize is an emerging framework that seeks to speed up the solution of certain optimization problems by leveraging training data. Learned optimization solvers have been shown to outperform classical optimization algorithms in terms of convergence speed, especially for convex problems. Many existing data-driven optimization methods are based on parameterizing the update step and learning the optimal parameters (typically scalars) from the available data. We propose a novel functional parameterization approach for learned convex optimization solvers based on the classical mirror descent (MD) algorithm. Specifically, we seek to learn the optimal Bregman distance in MD by modeling the underlying convex function using an input-convex neural network (ICNN). The parameters of the ICNN are learned by minimizing the target objective function evaluated at the MD iterate after a predetermined number of iterations. The inverse of the mirror map is modeled approximately using another neural network, as the exact inverse is intractable to compute. We derive convergence rate bounds for the proposed learned mirror descent (LMD) approach with an approximate inverse mirror map and perform extensive numerical evaluation on various convex problems such as image inpainting, denoising, learning a two-class support vector machine (SVM) classifier and a multi-class linear classifier on fixed features.
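As background for the functional parameterization, the classical MD iteration the paper builds on is easy to state. The sketch below runs mirror descent on the probability simplex with the hand-picked negative-entropy potential; in the paper this fixed potential is replaced by a Bregman distance learned with an ICNN, a step not reproduced here.

```python
import numpy as np

def mirror_descent_simplex(grad, x0, step=0.1, iters=200):
    """Mirror descent with the negative-entropy potential psi(x) = sum_i x_i log x_i."""
    x = x0.copy()
    for _ in range(iters):
        x = x * np.exp(-step * grad(x))   # gradient step taken in the dual (mirror) space
        x /= x.sum()                      # inverse mirror map back onto the simplex
    return x

# toy convex objective on the simplex: f(x) = 0.5 * ||x - c||^2
c = np.array([0.7, 0.2, 0.1])
x_star = mirror_descent_simplex(lambda x: x - c, np.ones(3) / 3)
```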
Citations: 7
Speedy Categorical Distributional Reinforcement Learning and Complexity Analysis
Q1 MATHEMATICS, APPLIED Pub Date: 2022-06-01 DOI: 10.1137/20m1364436
Markus Böck, C. Heitzinger
In distributional reinforcement learning, the entire distribution of the return, instead of just the expected return, is modeled. The approach with categorical distributions as the approximation method is well known in Q-learning, and convergence results have been established in the tabular case. In this work, speedy Q-learning is extended to categorical distributions, a finite-time analysis is performed, and probably approximately correct bounds in terms of the Cramér distance are established. It is shown that, also in the distributional case, the new update rule yields faster policy evaluation than standard Q-learning, and that the sample complexity is essentially the same as that of the value-based algorithmic counterpart. Without the need for more state-action-reward samples, one gains significantly more information about the return with categorical distributions. Even though the results do not easily extend to the case of policy control, a slight modification of the update rule yields promising numerical results.
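For orientation, the categorical approximation this work builds on hinges on a projection step that maps a Bellman-shifted distribution back onto a fixed grid of atoms. Below is a hedged NumPy sketch of that standard projection (as in C51-style categorical Q-learning); the speedy update rule and its finite-time analysis are not reproduced.

```python
import numpy as np

def project_categorical(z, p, r, gamma):
    """Project the Bellman-shifted distribution (r + gamma*z, p) onto the fixed atoms z."""
    dz = z[1] - z[0]
    tz = np.clip(r + gamma * z, z[0], z[-1])   # shifted/scaled atom locations
    b = (tz - z[0]) / dz                       # fractional index of each shifted atom
    lo = np.floor(b).astype(int)
    hi = np.minimum(lo + 1, len(z) - 1)
    w_hi = b - lo                              # linear interpolation weights
    out = np.zeros_like(p)
    np.add.at(out, lo, p * (1.0 - w_hi))       # split each atom's mass between
    np.add.at(out, hi, p * w_hi)               # its two nearest grid neighbors
    return out

z = np.linspace(-10.0, 10.0, 51)               # fixed support (51 atoms)
p = np.full(51, 1.0 / 51)                      # current return distribution
p_new = project_categorical(z, p, r=1.0, gamma=0.99)
assert abs(p_new.sum() - 1.0) < 1e-12          # projection conserves probability mass
```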
Citations: 2
Wasserstein-Based Projections with Applications to Inverse Problems
Q1 MATHEMATICS, APPLIED Pub Date: 2022-05-05 DOI: 10.1137/20m1376790
Howard Heaton, Samy Wu Fung, A. Lin, S. Osher, W. Yin
Inverse problems consist of recovering a signal from a collection of noisy measurements. These are typically cast as optimization problems, with classic approaches using a data-fidelity term and an analytic regularizer that stabilizes recovery. Recent Plug-and-Play (PnP) works propose replacing the operator for analytic regularization in optimization methods by a data-driven denoiser. These schemes obtain state-of-the-art results, but at the cost of limited theoretical guarantees. To bridge this gap, we present a new algorithm that takes samples from the manifold of true data as input and outputs an approximation of the projection operator onto this manifold. Under standard assumptions, we prove this algorithm generates a learned operator, called a Wasserstein-based projection (WP), that approximates the true projection with high probability. Thus, WPs can be inserted into optimization methods in the same manner as PnP, but now with theoretical guarantees. Numerical examples show that WPs obtain state-of-the-art results for unsupervised PnP signal recovery.
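How a projection operator slots into an optimization method can be sketched generically. The snippet below runs projected gradient descent on a least-squares data-fidelity term, with a placeholder `project` operator (soft-thresholding, standing in for a sparsity constraint) where the paper would plug in its Wasserstein-trained projection network; the training of WPs themselves is not shown.

```python
import numpy as np

def pnp_projected_gradient(A, y, project, step, iters=200):
    """Projected gradient on 0.5*||Ax - y||^2 with a pluggable projection operator."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x - step * A.T @ (A @ x - y)   # gradient step on the data-fidelity term
        x = project(x)                     # plug in the (learned) projection here
    return x

# placeholder projection: soft-thresholding, standing in for a learned WP
soft = lambda x, t=0.02: np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

rng = np.random.default_rng(1)
A = rng.normal(size=(60, 100)) / np.sqrt(60)   # compressed-sensing style forward operator
x_true = np.zeros(100)
x_true[rng.choice(100, 8, replace=False)] = 1.0
y = A @ x_true + 0.01 * rng.normal(size=60)
x_hat = pnp_projected_gradient(A, y, soft, step=0.1)
```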
Citations: 12
Nonbacktracking spectral clustering of nonuniform hypergraphs
Q1 MATHEMATICS, APPLIED Pub Date: 2022-04-27 DOI: 10.48550/arXiv.2204.13586
Philip S. Chodrow, Nicole Eikmeier, Jamie Haddock
Spectral methods offer a tractable, global framework for clustering in graphs via eigenvector computations on graph matrices. Hypergraph data, in which entities interact on edges of arbitrary size, poses challenges for matrix representations and therefore for spectral clustering. We study spectral clustering for nonuniform hypergraphs based on the hypergraph nonbacktracking operator. After reviewing the definition of this operator and its basic properties, we prove a theorem of Ihara-Bass type which allows eigenpair computations to take place on a smaller matrix, often enabling faster computation. We then propose an alternating algorithm for inference in a hypergraph stochastic blockmodel via linearized belief propagation, which involves a spectral clustering step that again uses nonbacktracking operators. We provide proofs related to this algorithm that both formalize and extend several previous results. We pose several conjectures about the limits of spectral methods and detectability in hypergraph stochastic blockmodels in general, supporting these with an in-expectation analysis of the eigenpairs of the operators we study. We perform experiments on real and synthetic data that demonstrate the benefits of hypergraph methods over graph-based ones when interactions of different sizes carry different information about cluster structure.
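For the ordinary graph case, both objects in the abstract are simple to write down: the nonbacktracking operator B on directed edges, and the Ihara-Bass companion matrix of size 2n whose spectrum carries the nontrivial eigenvalues of B. The sketch below illustrates this for graphs only; the hypergraph generalization proved in the paper is not reproduced.

```python
import numpy as np

def nonbacktracking_spectra(A):
    """Eigenvalues of the edge nonbacktracking matrix B and of its 2n x 2n companion."""
    n = A.shape[0]
    edges = [(u, v) for u in range(n) for v in range(n) if A[u, v]]
    idx = {e: i for i, e in enumerate(edges)}
    B = np.zeros((len(edges), len(edges)))
    for (u, v) in edges:
        for w in range(n):
            if A[v, w] and w != u:     # follow v -> w only if it does not backtrack to u
                B[idx[(u, v)], idx[(v, w)]] = 1.0
    D = np.diag(A.sum(axis=1))
    # Ihara-Bass companion matrix: same nontrivial spectrum, size 2n instead of 2m
    C = np.block([[A, np.eye(n) - D], [np.eye(n), np.zeros((n, n))]])
    return np.linalg.eigvals(B), np.linalg.eigvals(C)

A = np.roll(np.eye(6), 1, axis=1) + np.roll(np.eye(6), -1, axis=1)  # 6-cycle
eB, eC = nonbacktracking_spectra(A)   # the two spectra agree up to trivial eigenvalues
```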
Citations: 13
An improved central limit theorem and fast convergence rates for entropic transportation costs
Q1 MATHEMATICS, APPLIED Pub Date: 2022-04-19 DOI: 10.48550/arXiv.2204.09105
E. Barrio, Alberto González-Sanz, Jean-Michel Loubes, Jonathan Niles-Weed
We prove a central limit theorem for the entropic transportation cost between subgaussian probability measures, centered at the population cost. This is the first result which allows for asymptotically valid inference for entropic optimal transport between measures which are not necessarily discrete. In the compactly supported case, we complement these results with new, faster, convergence rates for the expected entropic transportation cost between empirical measures. Our proof is based on strengthening convergence results for dual solutions to the entropic optimal transport problem.
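The statistic under study, the entropic transportation cost between empirical measures, can be computed with Sinkhorn iterations. Below is a minimal one-dimensional NumPy sketch under assumed squared-distance cost and uniform weights; it only computes the statistic and says nothing about the CLT itself.

```python
import numpy as np

def entropic_cost(x, y, eps=0.25, iters=500):
    """Entropic OT cost between 1-D empirical measures with squared-distance cost."""
    n, m = len(x), len(y)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)    # uniform marginals
    C = (x[:, None] - y[None, :]) ** 2                 # cost matrix
    K = np.exp(-C / eps)                               # Gibbs kernel
    v = np.ones(m)
    for _ in range(iters):                             # Sinkhorn fixed-point iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                    # entropic optimal plan
    # transport cost plus the entropic penalty relative to the product measure
    return np.sum(P * C) + eps * np.sum(P * np.log(P / np.outer(a, b)))

rng = np.random.default_rng(2)
cost = entropic_cost(rng.normal(size=200), rng.normal(0.5, 1.0, size=200))
```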
Citations: 22
Statistical Analysis of Random Objects Via Metric Measure Laplacians
Q1 MATHEMATICS, APPLIED Pub Date: 2022-04-13 DOI: 10.1137/22m1491022
Gilles Mordant, A. Munk
In this paper, we consider a certain convolutional Laplacian for metric measure spaces and investigate its potential for the statistical analysis of complex objects. The spectrum of that Laplacian serves as a signature of the space under consideration, and the eigenvectors provide the principal directions of the shape, its harmonics. These concepts are used to assess the similarity of objects or to understand their most important features in a principled way, as illustrated in various examples. Adopting a statistical point of view, we define a mean spectral measure and its empirical counterpart. The corresponding limiting process of interest is derived, and statistical applications are discussed.
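The "spectrum as signature" idea can be illustrated with a generic kernel Laplacian built from a sampled metric measure space. This is only loosely analogous: the specific convolutional Laplacian and the mean spectral measure of the paper are not reproduced, and the kernel and bandwidth below are assumptions for illustration.

```python
import numpy as np

def kernel_laplacian_spectrum(points, bandwidth=0.5, k=10):
    """Smallest k eigenvalues of L = D - W for a Gaussian kernel W on sample points."""
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2 * bandwidth ** 2))   # kernel weights from pairwise distances
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W           # unnormalized graph Laplacian
    return np.sort(np.linalg.eigvalsh(L))[:k]

# compare two "shapes" (a circle and an ellipse) by their spectral signatures
t = np.linspace(0, 2 * np.pi, 100, endpoint=False)
circle = np.c_[np.cos(t), np.sin(t)]
ellipse = np.c_[2 * np.cos(t), np.sin(t)]
gap = np.linalg.norm(kernel_laplacian_spectrum(circle) - kernel_laplacian_spectrum(ellipse))
```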
Citations: 1
Approximation of Lipschitz Functions using Deep Spline Neural Networks
Q1 MATHEMATICS, APPLIED Pub Date: 2022-04-13 DOI: 10.48550/arXiv.2204.06233
Sebastian Neumayer, Alexis Goujon, Pakshal Bohra, M. Unser
Lipschitz-constrained neural networks have many applications in machine learning. Since designing and training expressive Lipschitz-constrained networks is very challenging, there is a need for improved methods and a better theoretical understanding. Unfortunately, it turns out that ReLU networks have provable disadvantages in this setting. Hence, we propose to use learnable spline activation functions with at least three linear regions instead. We prove that this choice is optimal among all component-wise $1$-Lipschitz activation functions in the sense that no other weight-constrained architecture can approximate a larger class of functions. Additionally, this choice is at least as expressive as the recently introduced non-component-wise GroupSort activation function for spectral-norm-constrained weights. Previously published numerical results support our theoretical findings.
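A hedged PyTorch sketch of the ingredient argued for: a learnable linear-spline activation with three regions whose slopes are squashed into (-1, 1), making the activation 1-Lipschitz by construction. The paper's exact parameterization and constraint mechanism differ; this module, its name, and the choice of tanh-squashed slopes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ThreeRegionSpline(nn.Module):
    """Learnable piecewise-linear activation with knots t1 < t2 and slopes in (-1, 1)."""
    def __init__(self):
        super().__init__()
        self.knots = nn.Parameter(torch.tensor([-1.0, 1.0]))  # assumed to stay ordered
        self.raw_slopes = nn.Parameter(torch.zeros(3))        # unconstrained parameters

    def forward(self, x):
        s = torch.tanh(self.raw_slopes)        # squash slopes into (-1, 1): 1-Lipschitz
        t1, t2 = self.knots[0], self.knots[1]
        # continuous piecewise-linear function assembled from three slope segments
        return (s[0] * torch.clamp(x - t1, max=0.0)
                + s[1] * (torch.clamp(x, min=t1, max=t2) - t1)
                + s[2] * torch.clamp(x - t2, min=0.0))

act = ThreeRegionSpline()
y = act(torch.linspace(-3.0, 3.0, 7))   # differentiable w.r.t. knots and slopes
```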
Citations: 14
A Nonlinear Matrix Decomposition for Mining the Zeros of Sparse Data
Q1 MATHEMATICS, APPLIED Pub Date: 2022-04-07 DOI: 10.1137/21m1405769
L. Saul
We describe a simple iterative solution to a widely recurring problem in multivariate data analysis: given a sparse nonnegative matrix X, how can we estimate a low-rank matrix Θ such that X ≈ f(Θ), where f is an elementwise nonlinearity? We develop a latent variable model for this problem and consider those sparsifying nonlinearities, popular in neural networks, that map all negative values to zero. The model seeks to explain the variability of sparse high-dimensional data in terms of a smaller number of degrees of freedom. We show that exact inference in this model is tractable and derive an expectation-maximization (EM) algorithm to estimate the low-rank matrix Θ. Notably, we do not parameterize Θ as a product of smaller matrices to be alternately optimized; instead, we estimate Θ directly via the singular value decomposition of matrices that are repeatedly inferred (at each iteration of the EM algorithm) from the model's posterior distribution. We use the model to analyze large sparse matrices that arise from data sets of binary, grayscale, and color images. In all of these cases, we find that the model discovers much lower-rank decompositions than purely linear approaches.
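A simplified alternating heuristic conveys the shape of the problem X ≈ f(Θ) with f = ReLU: keep Θ equal to X on the observed positives, allow Θ ≤ 0 on the zeros, and re-fit a truncated SVD. This projection-style sketch stands in for, and is not, the posterior-based EM algorithm derived in the paper.

```python
import numpy as np

def relu_decompose(X, rank, iters=50):
    """Heuristic alternating fit of a low-rank Theta with X approx max(Theta, 0)."""
    Theta = X.copy()
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Theta, full_matrices=False)
        low = U[:, :rank] * s[:rank] @ Vt[:rank]          # best rank-r approximation
        # keep observed positives exactly; on the zeros of X, only require Theta <= 0
        Theta = np.where(X > 0, X, np.minimum(low, 0.0))
    return Theta

rng = np.random.default_rng(3)
ground = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 50))
X = np.maximum(ground, 0.0)          # sparse nonnegative data generated as f(Theta)
Theta_hat = relu_decompose(X, rank=4)
```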
Citations: 3