Fast transforms correspond to factorizations of the form $\mathbf{Z} = \mathbf{X}^{(1)} \ldots \mathbf{X}^{(J)}$, where each factor $\mathbf{X}^{(\ell)}$ is sparse and possibly structured. This paper investigates essential uniqueness of such factorizations, i.e., uniqueness up to unavoidable scaling ambiguities. Our main contribution is to prove that any $N \times N$ matrix having the so-called butterfly structure admits an essentially unique factorization into $J$ butterfly factors (where $N = 2^{J}$), and that the factors can be recovered by a hierarchical factorization method, which consists in recursively factorizing the considered matrix into two factors. This hierarchical identifiability property relies on a simple identifiability condition in the two-layer and fixed-support setting. This approach contrasts with existing ones that fit the product of butterfly factors to a given matrix via gradient descent. The proposed method can be applied in particular to retrieve the factorization of the Hadamard or the discrete Fourier transform matrices of size $N = 2^{J}$. Computing such factorizations costs $\mathcal{O}(N^{2})$, which is of the order of dense matrix-vector multiplication, while the obtained factorizations enable fast $\mathcal{O}(N \log N)$ matrix-vector multiplications and have the potential to be applied to compress deep neural networks.
{"title":"Efficient Identification of Butterfly Sparse Matrix Factorizations","authors":"Léon Zheng, E. Riccietti, R. Gribonval","doi":"10.1137/22m1488727","DOIUrl":"https://doi.org/10.1137/22m1488727","url":null,"abstract":"Fast transforms correspond to factorizations of the form $mathbf{Z} = mathbf{X}^{(1)} ldots mathbf{X}^{(J)}$, where each factor $ mathbf{X}^{(ell)}$ is sparse and possibly structured. This paper investigates essential uniqueness of such factorizations, i.e., uniqueness up to unavoidable scaling ambiguities. Our main contribution is to prove that any $N times N$ matrix having the so-called butterfly structure admits an essentially unique factorization into $J$ butterfly factors (where $N = 2^{J}$), and that the factors can be recovered by a hierarchical factorization method, which consists in recursively factorizing the considered matrix into two factors. This hierarchical identifiability property relies on a simple identifiability condition in the two-layer and fixed-support setting. This approach contrasts with existing ones that fit the product of butterfly factors to a given matrix via gradient descent. The proposed method can be applied in particular to retrieve the factorization of the Hadamard or the discrete Fourier transform matrices of size $N=2^J$. Computing such factorizations costs $mathcal{O}(N^{2})$, which is of the order of dense matrix-vector multiplication, while the obtained factorizations enable fast $mathcal{O}(N log N)$ matrix-vector multiplications and have the potential to be applied to compress deep neural networks.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"50 1","pages":"22-49"},"PeriodicalIF":0.0,"publicationDate":"2021-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75817474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Emerging applications in multi-agent environments, such as the internet-of-things, networked sensing, autonomous systems, and federated learning, call for decentralized algorithms for finite-sum optimization that are resource-efficient in terms of both computation and communication. In this paper, we consider the prototypical setting where the agents work collaboratively to minimize the sum of local loss functions by only communicating with their neighbors over a predetermined network topology. We develop a new algorithm, called DEcentralized STochastic REcurSive gradient methodS (DESTRESS), for nonconvex finite-sum optimization, which matches the optimal incremental first-order oracle (IFO) complexity of centralized algorithms for finding first-order stationary points, while maintaining communication efficiency. Detailed theoretical and numerical comparisons corroborate that the resource efficiency of DESTRESS improves upon prior decentralized algorithms over a wide range of parameter regimes. DESTRESS leverages several key algorithm design ideas, including randomly activated stochastic recursive gradient updates with mini-batches for local computation and gradient tracking with extra mixing (i.e., multiple gossiping rounds) for per-iteration communication, together with careful choices of hyperparameters and new analysis frameworks, to provably achieve a desirable computation-communication trade-off.
{"title":"DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization","authors":"Boyue Li, Zhize Li, Yuejie Chi","doi":"10.1137/21m1450677","DOIUrl":"https://doi.org/10.1137/21m1450677","url":null,"abstract":"Emerging applications in multi-agent environments such as internet-of-things, networked sensing, autonomous systems and federated learning, call for decentralized algorithms for finite-sum optimizations that are resource-efficient in terms of both computation and communication. In this paper, we consider the prototypical setting where the agents work collaboratively to minimize the sum of local loss functions by only communicating with their neighbors over a predetermined network topology. We develop a new algorithm, called DEcentralized STochastic REcurSive gradient methodS (DESTRESS) for nonconvex finite-sum optimization, which matches the optimal incremental first-order oracle (IFO) complexity of centralized algorithms for finding first-order stationary points, while maintaining communication efficiency. Detailed theoretical and numerical comparisons corroborate that the resource efficiencies of DESTRESS improve upon prior decentralized algorithms over a wide range of parameter regimes. DESTRESS leverages several key algorithm design ideas including randomly activated stochastic recursive gradient updates with mini-batches for local computation, gradient tracking with extra mixing (i.e., multiple gossiping rounds) for per-iteration communication, together with careful choices of hyper-parameters and new analysis frameworks to provably achieve a desirable computation-communication trade-off.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"6 1","pages":"1031-1051"},"PeriodicalIF":0.0,"publicationDate":"2021-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84079619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sum-of-norms clustering is a convex optimization problem whose solution can be used for the clustering of multivariate data. We propose and study a localized version of this method, and show in particular that it can separate arbitrarily close balls in the stochastic ball model. More precisely, we prove a quantitative bound on the error incurred in the clustering of disjoint connected sets. Our bound is expressed in terms of the number of datapoints and the localization length of the functional.
{"title":"Local versions of sum-of-norms clustering","authors":"Alexander Dunlap, J. Mourrat","doi":"10.1137/21m1448732","DOIUrl":"https://doi.org/10.1137/21m1448732","url":null,"abstract":". Sum-of-norms clustering is a convex optimization problem whose solution can be used for the clustering of multivariate data. We propose and study a localized version of this method, and show in particular that it can separate arbitrarily close balls in the stochastic ball model. More precisely, we prove a quantitative bound on the error incurred in the clustering of disjoint connected sets. Our bound is expressed in terms of the number of datapoints and the localization length of the functional.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"11 1","pages":"1250-1271"},"PeriodicalIF":0.0,"publicationDate":"2021-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90784914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The paper establishes a strong correspondence between two important clustering approaches that emerged in the 1970s: clustering by level sets or cluster tree as proposed by Hartigan, and clustering by gradient lines or gradient flow as proposed by Fukunaga and Hostetler. We do so by showing that we can move up the cluster tree by following the gradient ascent flow.
{"title":"Moving Up the Cluster Tree with the Gradient Flow","authors":"E. Arias-Castro, Wanli Qiao","doi":"10.1137/22m1469869","DOIUrl":"https://doi.org/10.1137/22m1469869","url":null,"abstract":"The paper establishes a strong correspondence between two important clustering approaches that emerged in the 1970's: clustering by level sets or cluster tree as proposed by Hartigan and clustering by gradient lines or gradient flow as proposed by Fukunaga and Hostetler. We do so by showing that we can move up the cluster tree by following the gradient ascent flow.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48287290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We develop a method for analyzing spatial and spatiotemporal anomalies in geospatial data using topological data analysis (TDA). To do this, we use persistent homology (PH), which allows one to algorithmically detect geometric voids in a data set and quantify the persistence of such voids. We construct an efficient filtered simplicial complex (FSC) such that the voids in our FSC are in one-to-one correspondence with the anomalies. Our approach goes beyond simply identifying anomalies; it also encodes information about the relationships between anomalies. We use vineyards, which one can interpret as time-varying persistence diagrams (an approach for visualizing PH), to track how the locations of the anomalies change with time. We conduct two case studies using spatially heterogeneous COVID-19 data. First, we examine vaccination rates in New York City by zip code at a single point in time. Second, we study a year-long data set of COVID-19 case rates in neighborhoods of the city of Los Angeles.
{"title":"Analysis of Spatial and Spatiotemporal Anomalies Using Persistent Homology: Case Studies with COVID-19 Data","authors":"Abigail Hickok, D. Needell, M. A. Porter","doi":"10.1137/21m1435033","DOIUrl":"https://doi.org/10.1137/21m1435033","url":null,"abstract":"We develop a method for analyzing spatial and spatiotemporal anomalies in geospatial data using topological data analysis (TDA). To do this, we use persistent homology (PH), which allows one to algorithmically detect geometric voids in a data set and quantify the persistence of such voids. We construct an efficient filtered simplicial complex (FSC) such that the voids in our FSC are in one-to-one correspondence with the anomalies. Our approach goes beyond simply identifying anomalies;it also encodes information about the relationships between anomalies. We use vineyards, which one can interpret as time-varying persistence diagrams (which are an approach for visualizing PH), to track how the locations of the anomalies change with time. We conduct two case studies using spatially heterogeneous COVID-19 data. First, we examine vaccination rates in New York City by zip code at a single point in time. Second, we study a year-long data set of COVID-19 case rates in neighborhoods of the city of Los Angeles.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46879222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We prove minimax optimal learning rates for kernel ridge regression and, respectively, support vector machines based on a data-dependent partition of the input space, where the dependence on the dimension of the input space is replaced by the fractal dimension of the support of the data-generating distribution. We further show that these optimal rates can be achieved by a training-validation procedure without any prior knowledge of this intrinsic dimension of the data. Finally, we conduct extensive experiments which demonstrate that the considered learning methods are able to generalize from a dataset that is nontrivially embedded in a much higher-dimensional space just as well as from the original dataset.
{"title":"Intrinsic Dimension Adaptive Partitioning for Kernel Methods","authors":"Thomas Hamm, Ingo Steinwart","doi":"10.1137/21m1435690","DOIUrl":"https://doi.org/10.1137/21m1435690","url":null,"abstract":"We prove minimax optimal learning rates for kernel ridge regression, resp. support vector machines based on a data dependent partition of the input space, where the dependence of the dimension of the input space is replaced by the fractal dimension of the support of the data generating distribution. We further show that these optimal rates can be achieved by a training validation procedure without any prior knowledge on this intrinsic dimension of the data. Finally, we conduct extensive experiments which demonstrate that our considered learning methods are actually able to generalize from a dataset that is non-trivially embedded in a much higher dimensional space just as well as from the original dataset.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45933232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we consider a class of nonsmooth nonconvex optimization problems whose objective is the sum of a block relative smooth function and a proper and lower semicontinuous block separable function. Although the analysis of block proximal gradient (BPG) methods for the class of block $L$-smooth functions has been successfully extended to Bregman BPG methods that deal with the class of block relative smooth functions, accelerated Bregman BPG methods are scarce and challenging to design. Taking our inspiration from Nesterov-type acceleration and the majorization-minimization scheme, we propose a block alternating Bregman Majorization-Minimization framework with Extrapolation (BMME). We prove subsequential convergence of BMME to a first-order stationary point under mild assumptions, and study its global convergence under stronger conditions. We illustrate the effectiveness of BMME on the penalized orthogonal nonnegative matrix factorization problem.
{"title":"Block Alternating Bregman Majorization Minimization with Extrapolation","authors":"L. Hien, D. Phan, Nicolas Gillis, Masoud Ahookhosh, Panagiotis Patrinos","doi":"10.1137/21M1432661","DOIUrl":"https://doi.org/10.1137/21M1432661","url":null,"abstract":"In this paper, we consider a class of nonsmooth nonconvex optimization problems whose objective is the sum of a block relative smooth function and a proper and lower semicontinuous block separable function. Although the analysis of block proximal gradient (BPG) methods for the class of block $L$-smooth functions have been successfully extended to Bregman BPG methods that deal with the class of block relative smooth functions, accelerated Bregman BPG methods are scarce and challenging to design. Taking our inspiration from Nesterov-type acceleration and the majorization-minimization scheme, we propose a block alternating Bregman Majorization-Minimization framework with Extrapolation (BMME). We prove subsequential convergence of BMME to a first-order stationary point under mild assumptions, and study its global convergence under stronger conditions. We illustrate the effectiveness of BMME on the penalized orthogonal nonnegative matrix factorization problem.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"33 1","pages":"1-25"},"PeriodicalIF":0.0,"publicationDate":"2021-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85062665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a generalized CUR (GCUR) decomposition for matrix pairs (A, B). Given matrices A and B with the same number of columns, such a decomposition provides low-rank approximations of both matrices simultaneously, in terms of some of their rows and columns. We obtain the indices for selecting the subset of rows and columns of the original matrices using the discrete empirical interpolation method (DEIM) on the generalized singular vectors. When B is square and nonsingular, there are close connections between the GCUR of (A, B) and the DEIM-induced CUR of AB^{-1}. When B is the identity, the GCUR decomposition of A coincides with the DEIM-induced CUR decomposition of A. We also show a similar connection between the GCUR of (A, B) and the CUR of AB^+ for a nonsquare but full-rank matrix B, where B^+ denotes the Moore–Penrose pseudoinverse of B. While a CUR decomposition acts on one data set, a GCUR factorization jointly decomposes two data sets. The algorithm may be suitable for applications where one is interested in extracting the most discriminative features from one data set relative to another data set. In numerical experiments, we demonstrate the advantages of the new method over the standard CUR approximation for recovering data perturbed with colored noise and for subgroup discovery.
{"title":"A Generalized CUR decomposition for matrix pairs","authors":"Perfect Y. Gidisu, M. Hochstenbach","doi":"10.1137/21m1432119","DOIUrl":"https://doi.org/10.1137/21m1432119","url":null,"abstract":"We propose a generalized CUR (GCUR) decomposition for matrix pairs (A,B). Given matrices A and B with the same number of columns, such a decomposition provides low-rank approximations of both matrices simultaneously, in terms of some of their rows and columns. We obtain the indices for selecting the subset of rows and columns of the original matrices using the discrete empirical interpolation method (DEIM) on the generalized singular vectors. When B is square and nonsingular, there are close connections between the GCUR of (A,B) and the DEIM-induced CUR of AB−1. When B is the identity, the GCUR decomposition of A coincides with the DEIM-induced CUR decomposition of A. We also show similar connection between the GCUR of (A,B) and the CUR of AB for a nonsquare but full-rank matrix B, where B denotes the Moore–Penrose pseudoinverse of B. While a CUR decomposition acts on one data set, a GCUR factorization jointly decomposes two data sets. The algorithm may be suitable for applications where one is interested in extracting the most discriminative features from one data set relative to another data set. In numerical experiments, we demonstrate the advantages of the new method over the standard CUR approximation; for recovering data perturbed with colored noise and subgroup discovery.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"40 1","pages":"386-409"},"PeriodicalIF":0.0,"publicationDate":"2021-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81598226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper is concerned with the development, analysis and numerical realization of a novel variational model for the regularization of inverse problems in imaging. The proposed model is inspired by the architecture of generative convolutional neural networks; it aims to generate the unknown from variables in a latent space via multi-layer convolutions and non-linear penalties, and penalizes an associated cost. In contrast to conventional neural-network-based approaches, however, the convolution kernels are learned directly from the measured data such that no training is required. The present work provides a mathematical analysis of the proposed model in a function space setting, including proofs for regularity and existence/stability of solutions, and convergence for vanishing noise. Moreover, in a discretized setting, a numerical algorithm for solving various types of inverse problems with the proposed model is derived. Numerical results are provided for applications in inpainting, denoising, deblurring under noise, super-resolution and JPEG decompression with multiple test images.
{"title":"A Generative Variational Model for Inverse Problems in Imaging","authors":"Andreas Habring, M. Holler","doi":"10.1137/21m1414978","DOIUrl":"https://doi.org/10.1137/21m1414978","url":null,"abstract":"This paper is concerned with the development, analysis and numerical realization of a novel variational model for the regularization of inverse problems in imaging. The proposed model is inspired by the architecture of generative convolutional neural networks; it aims to generate the unknown from variables in a latent space via multi-layer convolutions and non-linear penalties, and penalizes an associated cost. In contrast to conventional neural-network-based approaches, however, the convolution kernels are learned directly from the measured data such that no training is required. The present work provides a mathematical analysis of the proposed model in a function space setting, including proofs for regularity and existence/stability of solutions, and convergence for vanishing noise. Moreover, in a discretized setting, a numerical algorithm for solving various types of inverse problems with the proposed model is derived. Numerical results are provided for applications in inpainting, denoising, deblurring under noise, super-resolution and JPEG decompression with multiple test images.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"40 1","pages":"306-335"},"PeriodicalIF":0.0,"publicationDate":"2021-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85104625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the computational sciences, one must often estimate model parameters from data subject to noise and uncertainty, leading to inaccurate results. In order to improve the accuracy of models with noisy parameters, we consider the problem of reducing error in a linear system with the operator corrupted by noise. Our contribution in this paper is to extend the elliptic operator shifting framework from Etter, Ying ’20 to the general nonsymmetric matrix case. Roughly, the operator shifting technique is a matrix analogue of the James-Stein estimator. The key insight is that a shift of the matrix inverse estimate in an appropriately chosen direction will reduce average error. In our extension, we interrogate a number of questions, namely, whether or not shifting towards the origin for general matrix inverses always reduces error as it does in the elliptic case. We show that this is usually the case, but that there are three key features of general nonsingular matrices that allow for adversarial examples not possible in the symmetric case. We prove that when these adversarial possibilities are eliminated by the assumption of noise symmetry and the use of the residual norm as the error metric, the optimal shift is always towards the origin, mirroring results from Etter, Ying ’20. We also investigate behavior in the small noise regime and other scenarios. We conclude by presenting numerical experiments (with accompanying source code) inspired by reinforcement learning to demonstrate that operator shifting can yield substantial reductions in error.
{"title":"Operator Shifting for General Noisy Matrix Systems","authors":"Philip A. Etter, Lexing Ying","doi":"10.1137/21m1416849","DOIUrl":"https://doi.org/10.1137/21m1416849","url":null,"abstract":". In the computational sciences, one must often estimate model parameters from data subject to noise and uncertainty, leading to inaccurate results. In order to improve the accuracy of models with noisy parameters, we consider the problem of reducing error in a linear system with the operator corrupted by noise. Our contribution in this paper is to extend the elliptic operator shifting framework from Etter, Ying ’20 to the general nonsymmetric matrix case. Roughly, the operator shifting technique is a matrix analogue of the James-Stein estimator. The key insight is that a shift of the matrix inverse estimate in an appropriately chosen direction will reduce average error. In our extension, we interrogate a number of questions — namely, whether or not shifting towards the origin for general matrix inverses always reduces error as it does in the elliptic case. We show that this is usually the case, but that there are three key features of the general nonsingular matrices that allow for adversarial examples not possible in the symmetric case. We prove that when these adversarial possibilities are eliminated by the assumption of noise symmetry and the use of the residual norm as the error metric, the optimal shift is always towards the origin, mirroring results from Etter, Ying ’20. We also investigate behavior in the small noise regime and other scenarios. We conclude by presenting numerical experiments (with accompanying source code) inspired by Reinforcement Learning to demonstrate that operator shifting can yield substantial reductions in error.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42222215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}