Spectral top-down recovery of latent tree models
Pub Date: 2023-08-16 | eCollection Date: 2023-09-01 | DOI: 10.1093/imaiai/iaad032 | Open Access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10431953/pdf/
Yariv Aizenbud, Ariel Jaffe, Meng Wang, Amber Hu, Noah Amsel, Boaz Nadler, Joseph T Chang, Yuval Kluger
Modeling the distribution of high-dimensional data by a latent tree graphical model is a prevalent approach in multiple scientific domains. A common task is to infer the underlying tree structure, given only observations of its terminal nodes. Many algorithms for tree recovery are computationally intensive, which limits their applicability to trees of moderate size. For large trees, a common approach, termed divide-and-conquer, is to recover the tree structure in two steps. First, separately recover the structure of multiple, possibly random subsets of the terminal nodes. Second, merge the resulting subtrees to form a full tree. Here, we develop spectral top-down recovery (STDR), a deterministic divide-and-conquer approach to infer large latent tree models. Unlike previous methods, STDR partitions the terminal nodes in a non-random way, based on the Fiedler vector of a suitable Laplacian matrix related to the observed nodes. We prove that under certain conditions, this partitioning is consistent with the tree structure. This, in turn, leads to a significantly simpler merging procedure of the small subtrees. We prove that STDR is statistically consistent and bound the number of samples required to accurately recover the tree with high probability. Using simulated data from several common tree models in phylogenetics, we demonstrate that STDR has a significant advantage in terms of runtime, with improved or similar accuracy.
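The partitioning step described above is a spectral bisection by the Fiedler vector (the eigenvector of the second-smallest Laplacian eigenvalue). Below is a minimal Python sketch of that generic idea; the similarity matrix and all parameter choices are illustrative placeholders, not the specific Laplacian construction analysed in the paper.

```python
import numpy as np

def fiedler_bisection(W):
    """Split nodes into two groups by the sign of the Fiedler vector.

    W : (n, n) symmetric non-negative similarity matrix between observed nodes.
    Returns a boolean mask: True for one side of the partition, False for the other.
    """
    d = W.sum(axis=1)
    L = np.diag(d) - W                    # unnormalized graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]               # eigenvector of the 2nd-smallest eigenvalue
    return fiedler >= 0

# toy example: two loosely connected groups of observed nodes
n = 10
W = np.full((n, n), 0.05)
W[:5, :5] = 0.9
W[5:, 5:] = 0.9
np.fill_diagonal(W, 0.0)
print(fiedler_bisection(W))   # separates nodes 0-4 from nodes 5-9
```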
{"title":"Spectral top-down recovery of latent tree models.","authors":"Yariv Aizenbud, Ariel Jaffe, Meng Wang, Amber Hu, Noah Amsel, Boaz Nadler, Joseph T Chang, Yuval Kluger","doi":"10.1093/imaiai/iaad032","DOIUrl":"10.1093/imaiai/iaad032","url":null,"abstract":"<p><p>Modeling the distribution of high-dimensional data by a latent tree graphical model is a prevalent approach in multiple scientific domains. A common task is to infer the underlying tree structure, given only observations of its terminal nodes. Many algorithms for tree recovery are computationally intensive, which limits their applicability to trees of moderate size. For large trees, a common approach, termed <i>divide-and-conquer</i>, is to recover the tree structure in two steps. First, separately recover the structure of multiple, possibly random subsets of the terminal nodes. Second, merge the resulting subtrees to form a full tree. Here, we develop spectral top-down recovery (STDR), a deterministic divide-and-conquer approach to infer large latent tree models. Unlike previous methods, STDR partitions the terminal nodes in a non random way, based on the Fiedler vector of a suitable Laplacian matrix related to the observed nodes. We prove that under certain conditions, this partitioning is consistent with the tree structure. This, in turn, leads to a significantly simpler merging procedure of the small subtrees. We prove that STDR is statistically consistent and bound the number of samples required to accurately recover the tree with high probability. Using simulated data from several common tree models in phylogenetics, we demonstrate that STDR has a significant advantage in terms of runtime, with improved or similar accuracy.</p>","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2023-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10431953/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10233338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Minimax optimal regression over Sobolev spaces via Laplacian Eigenmaps on neighbourhood graphs
Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad034
Alden Green, Sivaraman Balakrishnan, Ryan J Tibshirani
In this paper, we study the statistical properties of Principal Components Regression with Laplacian Eigenmaps (PCR-LE), a method for non-parametric regression based on Laplacian Eigenmaps (LE). PCR-LE works by projecting a vector of observed responses $\textbf{Y} = (Y_1,\ldots,Y_n)$ onto a subspace spanned by certain eigenvectors of a neighbourhood graph Laplacian. We show that PCR-LE achieves minimax rates of convergence for random design regression over Sobolev spaces. Under sufficient smoothness conditions on the design density $p$, PCR-LE achieves the optimal rates for both estimation (where the optimal rate in squared $L^2$ norm is known to be $n^{-2s/(2s+d)}$) and goodness-of-fit testing ($n^{-4s/(4s+d)}$). We also consider the situation where the design is supported on a manifold of small intrinsic dimension $m$, and give upper bounds establishing that PCR-LE achieves the faster minimax estimation ($n^{-2s/(2s+m)}$) and testing ($n^{-4s/(4s+m)}$) rates of convergence. Interestingly, these rates are almost always much faster than the known rates of convergence of graph Laplacian eigenvectors to their population-level limits; in other words, for this problem, regression with estimated features appears to be much easier, statistically speaking, than estimating the features themselves. We support these theoretical results with empirical evidence.
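To make the projection step concrete, here is a minimal numpy sketch of a PCR-LE-style fit: build a k-NN graph over the design points, take the leading eigenvectors of its Laplacian and regress the responses onto them. The neighbourhood size, the number of eigenvectors and the symmetrization are illustrative choices, not the tuned quantities analysed in the paper.

```python
import numpy as np

def pcr_le_fit(X, Y, k=10, K=15):
    """Project responses Y onto the first K eigenvectors of a k-NN graph Laplacian."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    A = np.zeros((n, n))
    idx = np.argsort(D, axis=1)[:, 1:k + 1]                    # k nearest neighbours (skip self)
    rows = np.repeat(np.arange(n), k)
    A[rows, idx.ravel()] = 1.0
    A = np.maximum(A, A.T)                                      # symmetrize the adjacency
    L = np.diag(A.sum(axis=1)) - A                              # unnormalized graph Laplacian
    _, V = np.linalg.eigh(L)
    V_K = V[:, :K]                                              # smoothest K eigenvectors
    coef = V_K.T @ Y                                            # projection coefficients (V_K orthonormal)
    return V_K @ coef                                           # fitted values at the design points

# toy example: noisy smooth function of a 2-d random design
rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 2))
Y = np.sin(2 * np.pi * X[:, 0]) + 0.3 * rng.standard_normal(200)
Y_hat = pcr_le_fit(X, Y)
print("residual RMSE:", np.sqrt(np.mean((Y_hat - Y) ** 2)))
```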
{"title":"Minimax optimal regression over Sobolev spaces via Laplacian Eigenmaps on neighbourhood graphs","authors":"Alden Green, Sivaraman Balakrishnan, Ryan J Tibshirani","doi":"10.1093/imaiai/iaad034","DOIUrl":"https://doi.org/10.1093/imaiai/iaad034","url":null,"abstract":"Abstract In this paper, we study the statistical properties of Principal Components Regression with Laplacian Eigenmaps (PCR-LE), a method for non-parametric regression based on Laplacian Eigenmaps (LE). PCR-LE works by projecting a vector of observed responses ${textbf Y} = (Y_1,ldots ,Y_n)$ onto a subspace spanned by certain eigenvectors of a neighbourhood graph Laplacian. We show that PCR-LE achieves minimax rates of convergence for random design regression over Sobolev spaces. Under sufficient smoothness conditions on the design density $p$, PCR-LE achieves the optimal rates for both estimation (where the optimal rate in squared $L^2$ norm is known to be $n^{-2s/(2s + d)}$) and goodness-of-fit testing ($n^{-4s/(4s + d)}$). We also consider the situation where the design is supported on a manifold of small intrinsic dimension $m$, and give upper bounds establishing that PCR-LE achieves the faster minimax estimation ($n^{-2s/(2s + m)}$) and testing ($n^{-4s/(4s + m)}$) rates of convergence. Interestingly, these rates are almost always much faster than the known rates of convergence of graph Laplacian eigenvectors to their population-level limits; in other words, for this problem regression with estimated features appears to be much easier, statistically speaking, than estimating the features itself. We support these theoretical results with empirical evidence.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136266565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Statistical characterization of the chordal product determinant of Grassmannian codes
Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad035
Javier Álvarez-Vizoso, Carlos Beltrán, Diego Cuevas, Ignacio Santamaría, Vít Tuček, Gunnar Peters
We consider the chordal product determinant, a measure of the distance between two subspaces of the same dimension. In information theory, one searches for collections of elements in the complex Grassmannian whose pairwise chordal products are as large as possible. We characterize this function from a statistical perspective, which allows us to obtain bounds on the minimal chordal product and the related energy of such collections.
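For concreteness, a standard way to compute a chordal product between two subspaces is via principal angles: if $U$ and $V$ are orthonormal bases of two $k$-dimensional subspaces, the singular values of $U^{H}V$ are the cosines of the principal angles, so $\det(I - V^{H}UU^{H}V) = \prod_i \sin^2\theta_i$. The sketch below evaluates this quantity for random subspaces; it is an illustrative computation under that common definition, not the statistical characterization derived in the paper.

```python
import numpy as np

def chordal_product_det(U, V):
    """det(I - V^H U U^H V) = prod_i sin^2(theta_i) over the principal angles theta_i."""
    M = U.conj().T @ V
    s = np.linalg.svd(M, compute_uv=False)   # singular values = cos(theta_i)
    return np.prod(1.0 - s**2)               # = prod_i sin^2(theta_i)

def random_subspace(n, k, rng):
    """Orthonormal basis of a random k-dimensional subspace of C^n."""
    A = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A)
    return Q

rng = np.random.default_rng(0)
n, k, m = 8, 2, 5
subspaces = [random_subspace(n, k, rng) for _ in range(m)]
pairwise = [chordal_product_det(subspaces[i], subspaces[j])
            for i in range(m) for j in range(i + 1, m)]
print("minimal pairwise chordal product:", min(pairwise))
```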
{"title":"Statistical characterization of the chordal product determinant of Grassmannian codes","authors":"Javier Álvarez-Vizoso, Carlos Beltrán, Diego Cuevas, Ignacio Santamaría, Vít Tuček, Gunnar Peters","doi":"10.1093/imaiai/iaad035","DOIUrl":"https://doi.org/10.1093/imaiai/iaad035","url":null,"abstract":"Abstract We consider the chordal product determinant, a measure of the distance between two subspaces of the same dimension. In information theory, collections of elements in the complex Grassmannian are searched with the property that their pairwise chordal products are as large as possible. We characterize this function from an statistical perspective, which allows us to obtain bounds for the minimal chordal product and related energy of such collections.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136266747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analysis of the ratio of ℓ1 and ℓ2 norms for signal recovery with partial support information
Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad015
Huanmin Ge, Wengu Chen, Michael K. Ng
The ratio of the $\ell_{1}$ and $\ell_{2}$ norms, denoted $\ell_{1}/\ell_{2}$, has shown prominent performance in promoting sparsity. By adding partial support information to the standard $\ell_{1}/\ell_{2}$ minimization, we introduce a novel model, the weighted $\ell_{1}/\ell_{2}$ minimization, to recover sparse signals from linear measurements. We establish restricted isometry property based conditions for sparse signal recovery via the weighted $\ell_{1}/\ell_{2}$ minimization in both the noiseless and noisy cases, and we show that the proposed conditions are weaker than the analogous conditions for standard $\ell_{1}/\ell_{2}$ minimization when the accuracy of the partial support information is at least $50\%$. Moreover, we develop effective algorithms and illustrate our results via extensive numerical experiments on synthetic data in both noiseless and noisy settings.
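To make the model concrete, the sketch below minimizes a smoothed surrogate of the weighted $\ell_1/\ell_2$ objective subject to $Ax = b$ with a general-purpose solver, giving smaller weights to indices believed to be in the support. The weighting scheme, the smoothing and the use of SLSQP are illustrative simplifications, not the dedicated algorithms developed in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def weighted_l1_over_l2(A, b, weights, eps=1e-8):
    """Minimize sum_i w_i*sqrt(x_i^2+eps) / sqrt(||x||_2^2+eps) subject to A x = b."""
    obj = lambda x: np.sum(weights * np.sqrt(x**2 + eps)) / np.sqrt(np.sum(x**2) + eps)
    cons = {"type": "eq", "fun": lambda x: A @ x - b}
    x0, *_ = np.linalg.lstsq(A, b, rcond=None)        # minimum-norm feasible starting point
    res = minimize(obj, x0, constraints=[cons], method="SLSQP",
                   options={"maxiter": 500, "ftol": 1e-10})
    return res.x

# toy example: 5-sparse signal with partial (and partly wrong) support information
rng = np.random.default_rng(0)
m, n, s = 40, 100, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
support = rng.choice(n, size=s, replace=False)
x_true[support] = rng.standard_normal(s)
b = A @ x_true

weights = np.ones(n)
weights[support[:3]] = 0.1                             # trust 3 of the 5 true support indices
weights[np.setdiff1d(np.arange(n), support)[:2]] = 0.1  # plus 2 incorrect guesses
x_hat = weighted_l1_over_l2(A, b, weights)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```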
{"title":"Analysis of the ratio of ℓ1 and ℓ2 norms for signal recovery with partial support information","authors":"Huanmin Ge, Wengu Chen, Michael K. Ng","doi":"10.1093/imaiai/iaad015","DOIUrl":"https://doi.org/10.1093/imaiai/iaad015","url":null,"abstract":"\u0000 The ratio of $ell _{1}$ and $ell _{2}$ norms, denoted as $ell _{1}/ell _{2}$, has presented prominent performance in promoting sparsity. By adding partial support information to the standard $ell _{1}/ell _{2}$ minimization, in this paper, we introduce a novel model, i.e. the weighted $ell _{1}/ell _{2}$ minimization, to recover sparse signals from the linear measurements. The restricted isometry property based conditions for sparse signal recovery in both noiseless and noisy cases through the weighted $ell _{1}/ell _{2}$ minimization are established. And we show that the proposed conditions are weaker than the analogous conditions for standard $ell _{1}/ell _{2}$ minimization when the accuracy of the partial support information is at least $50%$. Moreover, we develop effective algorithms and illustrate our results via extensive numerical experiments on synthetic data in both noiseless and noisy cases.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78091741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Approximately low-rank recovery from noisy and local measurements by convex program
Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad013
Kiryung Lee, Rakshith Sharma Srinivasa, Marius Junge, Justin Romberg
Low-rank matrix models have been universally useful for numerous applications, from classical system identification to more modern matrix completion in signal processing and statistics. The nuclear norm has been employed as a convex surrogate for low-rankness, since it induces low-rank solutions to inverse problems. While the nuclear norm for low-rankness has an excellent analogy with the $\ell_1$ norm for sparsity through the singular value decomposition, other matrix norms also induce low-rankness. In particular, when one interprets a matrix as a linear operator between Banach spaces, various tensor product norms generalize the role of the nuclear norm. We provide a tensor-norm-constrained estimator for the recovery of approximately low-rank matrices from local measurements corrupted with noise. A tensor-norm regularizer is designed to adapt to the local structure. We derive a statistical analysis of the estimator for matrix completion and decentralized sketching by applying Maurey's empirical method to tensor products of Banach spaces. The estimator provides a near-optimal error bound in a minimax sense and admits a polynomial-time algorithm for these applications.
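As background on the nuclear-norm surrogate mentioned above, here is a minimal sketch of its proximal operator (singular value soft-thresholding) inside a simple proximal-gradient matrix-completion loop. It illustrates only the classical nuclear-norm baseline, not the tensor-norm-constrained estimator proposed in the paper, and the step size and threshold are arbitrary illustrative values.

```python
import numpy as np

def svt(Z, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def complete(Y, mask, tau=2.0, step=1.0, iters=200):
    """Proximal gradient for  min_X 0.5*||P_Omega(X - Y)||_F^2 + tau*||X||_*."""
    X = np.zeros_like(Y)
    for _ in range(iters):
        grad = mask * (X - Y)                 # gradient of the data-fit term
        X = svt(X - step * grad, step * tau)  # nuclear-norm proximal step
    return X

# toy example: rank-2 matrix observed on roughly 50% of its entries
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
mask = rng.random((50, 40)) < 0.5
X_hat = complete(mask * M, mask)
print("relative error:", np.linalg.norm(X_hat - M) / np.linalg.norm(M))
```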
{"title":"Approximately low-rank recovery from noisy and local measurements by convex program","authors":"Kiryung Lee, Rakshith Sharma Srinivasa, Marius Junge, Justin Romberg","doi":"10.1093/imaiai/iaad013","DOIUrl":"https://doi.org/10.1093/imaiai/iaad013","url":null,"abstract":"Abstract Low-rank matrix models have been universally useful for numerous applications, from classical system identification to more modern matrix completion in signal processing and statistics. The nuclear norm has been employed as a convex surrogate of the low-rankness since it induces a low-rank solution to inverse problems. While the nuclear norm for low rankness has an excellent analogy with the $ell _1$ norm for sparsity through the singular value decomposition, other matrix norms also induce low-rankness. Particularly as one interprets a matrix as a linear operator between Banach spaces, various tensor product norms generalize the role of the nuclear norm. We provide a tensor-norm-constrained estimator for the recovery of approximately low-rank matrices from local measurements corrupted with noise. A tensor-norm regularizer is designed to adapt to the local structure. We derive statistical analysis of the estimator over matrix completion and decentralized sketching by applying Maurey’s empirical method to tensor products of Banach spaces. The estimator provides a near-optimal error bound in a minimax sense and admits a polynomial-time algorithm for these applications.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136085521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Universal consistency of Wasserstein k-NN classifier: a negative and some positive results
Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad027
Donlapark Ponnoprat
We study the $k$-nearest neighbour classifier ($k$-NN) of probability measures under the Wasserstein distance. We show that the $k$-NN classifier is not universally consistent on the space of measures supported in $(0,1)$. As any Euclidean ball contains a copy of $(0,1)$, one should not expect to obtain universal consistency without some restriction on the base metric space, or on the Wasserstein space itself. To this end, via the notion of $\sigma$-finite metric dimension, we show that the $k$-NN classifier is universally consistent on spaces of discrete measures (and, more generally, $\sigma$-finite uniformly discrete measures) with rational mass. In addition, by studying the geodesic structures of the Wasserstein spaces for $p=1$ and $p=2$, we show that the $k$-NN classifier is universally consistent on spaces of measures supported on a finite set, the space of Gaussian measures and spaces of measures with finite wavelet series densities.
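A minimal illustration of the classifier itself, for one-dimensional empirical measures where the Wasserstein-1 distance has a closed form available in SciPy: store labelled sample sets, compute distances from a query measure and take a majority vote among the $k$ closest. The data, the choice of $k$ and the 1-D restriction are illustrative only; the paper's consistency results concern the general setting.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def knn_wasserstein_predict(train_measures, train_labels, query, k=3):
    """k-NN classification of a 1-D empirical measure under the W1 distance."""
    dists = np.array([wasserstein_distance(query, m) for m in train_measures])
    nearest = np.argsort(dists)[:k]
    votes = np.array(train_labels)[nearest]
    return np.bincount(votes).argmax()

# toy example: class 0 measures centred near 0, class 1 measures centred near 2
rng = np.random.default_rng(0)
train_measures = [rng.normal(0.0, 1.0, size=30) for _ in range(20)] + \
                 [rng.normal(2.0, 1.0, size=30) for _ in range(20)]
train_labels = [0] * 20 + [1] * 20
query = rng.normal(2.0, 1.0, size=30)
print("predicted class:", knn_wasserstein_predict(train_measures, train_labels, query))
```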
{"title":"Universal consistency of Wasserstein k-NN classifier: a negative and some positive results","authors":"Donlapark Ponnoprat","doi":"10.1093/imaiai/iaad027","DOIUrl":"https://doi.org/10.1093/imaiai/iaad027","url":null,"abstract":"\u0000 We study the $k$-nearest neighbour classifier ($k$-NN) of probability measures under the Wasserstein distance. We show that the $k$-NN classifier is not universally consistent on the space of measures supported in $(0,1)$. As any Euclidean ball contains a copy of $(0,1)$, one should not expect to obtain universal consistency without some restriction on the base metric space, or the Wasserstein space itself. To this end, via the notion of $sigma $-finite metric dimension, we show that the $k$-NN classifier is universally consistent on spaces of discrete measures (and more generally, $sigma $-finite uniformly discrete measures) with rational mass. In addition, by studying the geodesic structures of the Wasserstein spaces for $p=1$ and $p=2$, we show that the $k$-NN classifier is universally consistent on spaces of measures supported on a finite set, the space of Gaussian measures and spaces of measures with finite wavelet series densities.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80275020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multivariate super-resolution without separation
Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad024
Bakytzhan Kurmanbek, Elina Robeva
In this paper, we study the high-dimensional super-resolution imaging problem. Here, we are given an image of a number of point sources of light whose locations and intensities are unknown. The image is pixelized and is blurred by a known point-spread function arising from the imaging device. We encode the unknown point sources and their intensities via a non-negative measure and propose a convex optimization program to find it. Assuming the device's point-spread function is componentwise decomposable, we show that the optimal solution is the true measure in the noiseless case, and that it approximates the true measure well in the noisy case with respect to the generalized Wasserstein distance. Our main assumption is that the components of the point-spread function form a Tchebychev system ($T$-system) in the noiseless case and a $T^{*}$-system in the noisy case, mild conditions that are satisfied by Gaussian point-spread functions. Our work generalizes to all dimensions the work of [14], where the same analysis is carried out in two dimensions. We also extend the results of [27] to the high-dimensional case when the point-spread function decomposes.
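As a toy illustration of the general idea (not the paper's measure-valued convex program), one can discretize the candidate source locations on a fine grid, form the pixelized Gaussian blur operator and fit non-negative intensities by non-negative least squares. The grid, the PSF width and the NNLS formulation below are illustrative simplifications.

```python
import numpy as np
from scipy.optimize import nnls

# 1-D toy: two point sources blurred by a Gaussian PSF and observed on coarse pixels
grid = np.linspace(0.0, 1.0, 201)      # candidate source locations
pixels = np.linspace(0.0, 1.0, 40)     # pixel centres of the observed image
sigma = 0.05                           # known PSF width (illustrative)

# forward operator: pixel response to a unit source at each grid location
A = np.exp(-(pixels[:, None] - grid[None, :])**2 / (2 * sigma**2))

true_locs, true_amps = np.array([0.30, 0.62]), np.array([1.0, 0.7])
y = np.exp(-(pixels[:, None] - true_locs[None, :])**2 / (2 * sigma**2)) @ true_amps
y += 0.01 * np.random.default_rng(0).standard_normal(y.shape)   # measurement noise

x, _ = nnls(A, y)                       # non-negative intensities on the grid
support = np.flatnonzero(x > 0.05 * x.max())
print("estimated source locations:", grid[support])
```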
{"title":"Multivariate super-resolution without separation","authors":"Bakytzhan Kurmanbek, Elina Robeva","doi":"10.1093/imaiai/iaad024","DOIUrl":"https://doi.org/10.1093/imaiai/iaad024","url":null,"abstract":"Abstract In this paper, we study the high-dimensional super-resolution imaging problem. Here, we are given an image of a number of point sources of light whose locations and intensities are unknown. The image is pixelized and is blurred by a known point-spread function arising from the imaging device. We encode the unknown point sources and their intensities via a non-negative measure and we propose a convex optimization program to find it. Assuming the device’s point-spread function is componentwise decomposable, we show that the optimal solution is the true measure in the noiseless case, and it approximates the true measure well in the noisy case with respect to the generalized Wasserstein distance. Our main assumption is that the components of the point-spread function form a Tchebychev system ($T$-system) in the noiseless case and a $T^{*}$-system in the noisy case, mild conditions that are satisfied by Gaussian point-spread functions. Our work is a generalization to all dimensions of the work [14] where the same analysis is carried out in two dimensions. We also extend results in [27] to the high-dimensional case when the point-spread function decomposes.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136266723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A manifold two-sample test study: integral probability metric with neural networks
Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad018
Jie Wang, Minshuo Chen, Tuo Zhao, Wenjing Liao, Yao Xie
Two-sample tests aim to determine whether two collections of observations follow the same distribution. We propose two-sample tests based on integral probability metrics (IPMs) for high-dimensional samples supported on a low-dimensional manifold. We characterize the properties of the proposed tests with respect to the number of samples $n$ and the structure of the manifold with intrinsic dimension $d$. When an atlas is given, we propose a two-step test to identify the difference between general distributions, which achieves a type-II risk of order $n^{-1/\max\{d,2\}}$. When an atlas is not given, we propose the Hölder IPM test, which applies to data distributions with $(s,\beta)$-Hölder densities and achieves a type-II risk of order $n^{-(s+\beta)/d}$. To mitigate the heavy computational burden of evaluating the Hölder IPM, we approximate the Hölder function class using neural networks. Based on the approximation theory of neural networks, we show that the neural network IPM test has a type-II risk of order $n^{-(s+\beta)/d}$, the same order as that of the Hölder IPM test. Our proposed tests are adaptive to low-dimensional geometric structure because their performance crucially depends on the intrinsic dimension rather than the data dimension.
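As background on IPM-based testing in the simplest setting: when the function class is the 1-Lipschitz ball in one dimension, the IPM equals the Wasserstein-1 distance (Kantorovich-Rubinstein duality), and a permutation test can calibrate the statistic. The sketch below implements that 1-D Lipschitz-IPM test; it is an illustrative analogue, not the Hölder or neural-network IPM tests analysed in the paper.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def ipm_permutation_test(x, y, n_perm=500, seed=0):
    """Two-sample test with the 1-Lipschitz IPM (= W1 in 1-D) and a permutation null."""
    rng = np.random.default_rng(seed)
    stat = wasserstein_distance(x, y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if wasserstein_distance(perm[:len(x)], perm[len(x):]) >= stat:
            count += 1
    return stat, (count + 1) / (n_perm + 1)     # permutation p-value

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=200)
y = rng.normal(0.3, 1.0, size=200)              # small mean shift between the samples
stat, pval = ipm_permutation_test(x, y)
print(f"W1 statistic = {stat:.3f}, permutation p-value = {pval:.3f}")
```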
{"title":"A manifold two-sample test study: integral probability metric with neural networks","authors":"Jie Wang, Minshuo Chen, Tuo Zhao, Wenjing Liao, Yao Xie","doi":"10.1093/imaiai/iaad018","DOIUrl":"https://doi.org/10.1093/imaiai/iaad018","url":null,"abstract":"Abstract Two-sample tests are important areas aiming to determine whether two collections of observations follow the same distribution or not. We propose two-sample tests based on integral probability metric (IPM) for high-dimensional samples supported on a low-dimensional manifold. We characterize the properties of proposed tests with respect to the number of samples $n$ and the structure of the manifold with intrinsic dimension $d$. When an atlas is given, we propose a two-step test to identify the difference between general distributions, which achieves the type-II risk in the order of $n^{-1/max {d,2}}$. When an atlas is not given, we propose Hölder IPM test that applies for data distributions with $(s,beta )$-Hölder densities, which achieves the type-II risk in the order of $n^{-(s+beta )/d}$. To mitigate the heavy computation burden of evaluating the Hölder IPM, we approximate the Hölder function class using neural networks. Based on the approximation theory of neural networks, we show that the neural network IPM test has the type-II risk in the order of $n^{-(s+beta )/d}$, which is in the same order of the type-II risk as the Hölder IPM test. Our proposed tests are adaptive to low-dimensional geometric structure because their performance crucially depends on the intrinsic dimension instead of the data dimension.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136266917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fast and provable tensor robust principal component analysis via scaled gradient descent
Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad019
Harry Dong, Tian Tong, Cong Ma, Yuejie Chi
An increasing number of data science and machine learning problems rely on computation with tensors, which capture the multi-way relationships and interactions in data better than matrices. When tapping into this critical advantage, a key challenge is to develop computationally efficient and provably correct algorithms for extracting useful information from tensor data that are simultaneously robust to corruptions and ill-conditioning. This paper tackles tensor robust principal component analysis (RPCA), which aims to recover a low-rank tensor from observations contaminated by sparse corruptions, under the Tucker decomposition. To minimize the computation and memory footprints, we propose to directly recover the low-dimensional tensor factors—starting from a tailored spectral initialization—via scaled gradient descent (ScaledGD), coupled with an iteration-varying thresholding operation to adaptively remove the impact of corruptions. Theoretically, we establish that the proposed algorithm converges linearly to the true low-rank tensor at a constant rate that is independent of its condition number, as long as the level of corruptions is not too large. Empirically, we demonstrate that the proposed algorithm achieves better and more scalable performance than state-of-the-art tensor RPCA algorithms through synthetic experiments and real-world applications.
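To convey the flavour of the update rule in the simplest (matrix, rank-$r$) analogue: alternate a hard-thresholding step that estimates the sparse corruption with scaled gradient steps on the two factors, where the preconditioning by $(R^\top R)^{-1}$ and $(L^\top L)^{-1}$ is what removes the dependence on the condition number. Everything below (the thresholding rule, step size, initialization and the matrix setting itself) is an illustrative toy, not the paper's Tucker-format algorithm or its tailored spectral initialization.

```python
import numpy as np

def hard_threshold(M, frac):
    """Keep only the largest-magnitude entries (a fraction `frac` of all entries)."""
    k = int(frac * M.size)
    if k == 0:
        return np.zeros_like(M)
    cutoff = np.partition(np.abs(M).ravel(), -k)[-k]
    return M * (np.abs(M) >= cutoff)

def scaled_gd_rpca(Y, r, alpha=0.1, eta=0.5, iters=100):
    """Toy matrix RPCA: Y ~ L R^T + S with S sparse, via ScaledGD-style factor updates."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    L = U[:, :r] * np.sqrt(s[:r])                          # naive spectral initialization
    R = Vt[:r].T * np.sqrt(s[:r])
    for _ in range(iters):
        S = hard_threshold(Y - L @ R.T, alpha)             # sparse-corruption estimate
        E = L @ R.T + S - Y                                 # residual
        L_new = L - eta * E @ R @ np.linalg.inv(R.T @ R)    # scaled (preconditioned) step on L
        R_new = R - eta * E.T @ L @ np.linalg.inv(L.T @ L)  # scaled step on R
        L, R = L_new, R_new
    return L @ R.T, S

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 50))   # rank-2 ground truth
S_true = np.zeros_like(X)
idx = rng.random(X.shape) < 0.05                                   # 5% gross corruptions
S_true[idx] = 10 * rng.standard_normal(idx.sum())
X_hat, _ = scaled_gd_rpca(X + S_true, r=2)
print("relative error:", np.linalg.norm(X_hat - X) / np.linalg.norm(X))
```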
{"title":"Fast and provable tensor robust principal component analysis via scaled gradient descent","authors":"Harry Dong, Tian Tong, Cong Ma, Yuejie Chi","doi":"10.1093/imaiai/iaad019","DOIUrl":"https://doi.org/10.1093/imaiai/iaad019","url":null,"abstract":"Abstract An increasing number of data science and machine learning problems rely on computation with tensors, which better capture the multi-way relationships and interactions of data than matrices. When tapping into this critical advantage, a key challenge is to develop computationally efficient and provably correct algorithms for extracting useful information from tensor data that are simultaneously robust to corruptions and ill-conditioning. This paper tackles tensor robust principal component analysis (RPCA), which aims to recover a low-rank tensor from its observations contaminated by sparse corruptions, under the Tucker decomposition. To minimize the computation and memory footprints, we propose to directly recover the low-dimensional tensor factors—starting from a tailored spectral initialization—via scaled gradient descent (ScaledGD), coupled with an iteration-varying thresholding operation to adaptively remove the impact of corruptions. Theoretically, we establish that the proposed algorithm converges linearly to the true low-rank tensor at a constant rate that is independent with its condition number, as long as the level of corruptions is not too large. Empirically, we demonstrate that the proposed algorithm achieves better and more scalable performance than state-of-the-art tensor RPCA algorithms through synthetic experiments and real-world applications.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136267080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lossy compression of general random variables
Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaac035
Erwin Riegler, Helmut Bölcskei, Günther Koliander
This paper is concerned with the lossy compression of general random variables, specifically with rate-distortion theory and the quantization of random variables taking values in general measurable spaces such as manifolds and fractal sets. Manifold structures are prevalent in data science, e.g. in compressed sensing, machine learning, image processing and handwritten digit recognition. Fractal sets find application in image compression and in the modeling of Ethernet traffic. Our main contributions are bounds on the rate-distortion function and the quantization error. These bounds are very general and essentially only require the existence of reference measures satisfying certain regularity conditions in terms of small ball probabilities. To illustrate the wide applicability of our results, we particularize them to random variables taking values in (i) manifolds, namely hyperspheres and Grassmannians, and (ii) self-similar sets characterized by iterated function systems satisfying the weak separation property.
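As a small numerical illustration of quantization on a manifold: sample points uniformly on a hypersphere, build a $2^{R}$-point codebook with k-means for a few rates $R$ and measure the resulting mean squared distortion. The codebook construction and distortion measure are simple illustrative choices, not the reference-measure machinery used to prove the bounds in the paper.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
d, n = 3, 5000
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)      # uniform samples on the unit sphere S^{d-1}

for rate in [2, 4, 6]:                              # a rate of R bits allows 2^R codewords
    codebook, labels = kmeans2(X, 2 ** rate, minit="points")
    distortion = np.mean(np.sum((X - codebook[labels]) ** 2, axis=1))
    print(f"rate {rate} bits: mean squared distortion {distortion:.4f}")
```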
{"title":"Lossy compression of general random variables","authors":"Erwin Riegler, Helmut Bölcskei, Günther Koliander","doi":"10.1093/imaiai/iaac035","DOIUrl":"https://doi.org/10.1093/imaiai/iaac035","url":null,"abstract":"Abstract This paper is concerned with the lossy compression of general random variables, specifically with rate-distortion theory and quantization of random variables taking values in general measurable spaces such as, e.g. manifolds and fractal sets. Manifold structures are prevalent in data science, e.g. in compressed sensing, machine learning, image processing and handwritten digit recognition. Fractal sets find application in image compression and in the modeling of Ethernet traffic. Our main contributions are bounds on the rate-distortion function and the quantization error. These bounds are very general and essentially only require the existence of reference measures satisfying certain regularity conditions in terms of small ball probabilities. To illustrate the wide applicability of our results, we particularize them to random variables taking values in (i) manifolds, namely, hyperspheres and Grassmannians and (ii) self-similar sets characterized by iterated function systems satisfying the weak separation property.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136085522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}