Phase transition and higher order analysis of L_q regularization under dependence
Hanwen Huang, Peng Zeng, Qinglong Yang
Pub Date: 2024-02-20 | eCollection Date: 2024-03-01 | DOI: 10.1093/imaiai/iaae005
We study the problem of estimating a [Formula: see text]-sparse signal [Formula: see text] from a set of noisy observations [Formula: see text] under the model [Formula: see text], where [Formula: see text] is the measurement matrix whose rows are drawn from the distribution [Formula: see text]. We consider the class of [Formula: see text]-regularized least squares (LQLS) estimators given by the formulation [Formula: see text], where [Formula: see text] [Formula: see text] denotes the [Formula: see text]-norm. In the setting [Formula: see text] with fixed [Formula: see text] and [Formula: see text], we derive the asymptotic risk of [Formula: see text] for an arbitrary covariance matrix [Formula: see text], which generalizes the existing results for standard Gaussian design, i.e. [Formula: see text]. The results are derived using the (non-rigorous) replica method. We perform a higher-order analysis of LQLS in the small-error regime, in which the first dominant term can be used to determine the phase transition behavior of LQLS. Our results show that the first dominant term does not depend on the covariance structure of [Formula: see text] in the cases [Formula: see text] and [Formula: see text], which indicates that correlations among predictors affect the phase transition curve only in the case [Formula: see text], a.k.a. the LASSO. To study the influence of the covariance structure of [Formula: see text] on the performance of LQLS in the cases [Formula: see text] and [Formula: see text], we derive explicit formulas for the second dominant term in the small-error expansion of the asymptotic risk. Extensive computational experiments confirm that our analytical predictions are consistent with numerical results.
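The LQLS family is easiest to see in its q = 1 instance. Below is a minimal sketch of that case, the LASSO, solved by proximal gradient descent (ISTA); the synthetic data, step size and regularization level are illustrative assumptions, not the paper's setup, and the design is the standard Gaussian one (identity covariance) that the paper generalizes.

```python
import numpy as np

def soft_threshold(v, t):
    """Prox of t * ||.||_1: entrywise soft thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lqls_q1(X, y, lam, n_iter=500):
    """argmin_b 0.5 * ||y - X b||_2^2 + lam * ||b||_1, via ISTA."""
    L = np.linalg.norm(X, 2) ** 2        # Lipschitz constant of the smooth part
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        b = soft_threshold(b - X.T @ (X @ b - y) / L, lam / L)
    return b

rng = np.random.default_rng(0)
n, p, k = 100, 200, 10                        # k-sparse signal, p > n
beta0 = np.zeros(p); beta0[:k] = 1.0
X = rng.standard_normal((n, p)) / np.sqrt(n)  # rows with identity covariance
y = X @ beta0 + 0.1 * rng.standard_normal(n)
beta_hat = lqls_q1(X, y, lam=0.05)
print("estimation error:", np.linalg.norm(beta_hat - beta0))
```

For q strictly between 1 and 2 the penalty is differentiable away from zero and the proximal step can be replaced by a per-coordinate scalar solve; the paper's risk analysis covers the general LQLS family.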
{"title":"Phase transition and higher order analysis of <i>L<sub>q</sub></i> regularization under dependence.","authors":"Hanwen Huang, Peng Zeng, Qinglong Yang","doi":"10.1093/imaiai/iaae005","DOIUrl":"10.1093/imaiai/iaae005","url":null,"abstract":"<p><p>We study the problem of estimating a [Formula: see text]-sparse signal [Formula: see text] from a set of noisy observations [Formula: see text] under the model [Formula: see text], where [Formula: see text] is the measurement matrix the row of which is drawn from distribution [Formula: see text]. We consider the class of [Formula: see text]-regularized least squares (LQLS) given by the formulation [Formula: see text], where [Formula: see text] [Formula: see text] denotes the [Formula: see text]-norm. In the setting [Formula: see text] with fixed [Formula: see text] and [Formula: see text], we derive the asymptotic risk of [Formula: see text] for arbitrary covariance matrix [Formula: see text] that generalizes the existing results for standard Gaussian design, i.e. [Formula: see text]. The results were derived from the non-rigorous replica method. We perform a higher-order analysis for LQLS in the small-error regime in which the first dominant term can be used to determine the phase transition behavior of LQLS. Our results show that the first dominant term does not depend on the covariance structure of [Formula: see text] in the cases [Formula: see text] and [Formula: see text] which indicates that the correlations among predictors only affect the phase transition curve in the case [Formula: see text] a.k.a. LASSO. To study the influence of the covariance structure of [Formula: see text] on the performance of LQLS in the cases [Formula: see text] and [Formula: see text], we derive the explicit formulas for the second dominant term in the expansion of the asymptotic risk in terms of small error. Extensive computational experiments confirm that our analytical predictions are consistent with numerical results.</p>","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10878746/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139933465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On statistical inference with high-dimensional sparse CCA
Nilanjana Laha, Nathan Huey, Brent Coull, Rajarshi Mukherjee
Pub Date: 2023-11-17 | DOI: 10.1093/imaiai/iaad040
We consider asymptotically exact inference on the leading canonical correlation directions and strengths between two high-dimensional vectors under sparsity restrictions. Our main contribution is a novel representation of the canonical correlation analysis (CCA) problem, based on which one can operationalize a one-step bias correction on reasonable initial estimators. Our analytic results are adaptive over suitable structural restrictions on the high-dimensional nuisance parameters, which in this set-up correspond to the covariance matrices of the variables of interest. We supplement the theoretical guarantees behind our procedures with extensive numerical studies.
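To make "one-step bias correction" concrete, here is a minimal generic sketch under assumptions not in the abstract: a single Newton step on the estimating equations of a toy logistic regression, starting from a crude initial fit. The CCA-specific representation and nuisance handling of the paper are not reproduced here.

```python
import numpy as np

def one_step_correction(theta0, score, hessian):
    """One Newton step on the estimating equations: theta0 - H^{-1} s."""
    return theta0 - np.linalg.solve(hessian(theta0), score(theta0))

rng = np.random.default_rng(1)
n, p = 500, 3
X = rng.standard_normal((n, p))
theta_true = np.array([1.0, -0.5, 0.25])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ theta_true)))

def score(th):                       # gradient of the logistic log-likelihood
    return X.T @ (y - 1 / (1 + np.exp(-X @ th)))

def hessian(th):                     # Hessian of the logistic log-likelihood
    mu = 1 / (1 + np.exp(-X @ th))
    return -(X * (mu * (1 - mu))[:, None]).T @ X

# crude initial estimator (linear fit rescaled), then one corrective step
theta_init = np.linalg.lstsq(X, y - 0.5, rcond=None)[0] * 4
theta_one_step = one_step_correction(theta_init, score, hessian)
print("initial:", theta_init, "one-step:", theta_one_step)
```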
{"title":"On statistical inference with high-dimensional sparse CCA.","authors":"Nilanjana Laha, Nathan Huey, Brent Coull, Rajarshi Mukherjee","doi":"10.1093/imaiai/iaad040","DOIUrl":"10.1093/imaiai/iaad040","url":null,"abstract":"<p><p>We consider asymptotically exact inference on the leading canonical correlation directions and strengths between two high-dimensional vectors under sparsity restrictions. In this regard, our main contribution is developing a novel representation of the Canonical Correlation Analysis problem, based on which one can operationalize a one-step bias correction on reasonable initial estimators. Our analytic results in this regard are adaptive over suitable structural restrictions of the high-dimensional nuisance parameters, which, in this set-up, correspond to the covariance matrices of the variables of interest. We further supplement the theoretical guarantees behind our procedures with extensive numerical studies.</p>","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2023-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138048165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Black-box tests for algorithmic stability
Byol Kim, Rina Foygel Barber
Pub Date: 2023-10-14 | eCollection Date: 2023-12-01 | DOI: 10.1093/imaiai/iaad039
Algorithmic stability is a concept from learning theory that expresses the degree to which changes to the input data (e.g. removal of a single data point) may affect the outputs of a regression algorithm. Knowing an algorithm's stability properties is often useful for many downstream applications; for example, stability is known to lead to desirable generalization properties and predictive inference guarantees. However, many modern algorithms currently used in practice are too complex for a theoretical analysis of their stability properties, and thus we can only attempt to establish these properties through an empirical exploration of the algorithm's behaviour on various datasets. In this work, we lay out a formal statistical framework for this kind of black-box testing, without any assumptions on the algorithm or the data distribution, and establish fundamental bounds on the ability of any black-box test to identify algorithmic stability.
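As an illustration of what such an empirical exploration can look like, the sketch below perturbs a black-box algorithm by deleting one training point, records how often a prediction moves by more than a tolerance, and runs a binomial test on the resulting count. The tolerance, trial count and null rate are illustrative choices, not the paper's exact procedure or bounds.

```python
import numpy as np
from scipy import stats

def stability_trials(fit_predict, data, x_test, eps, n_trials, rng):
    """Count trials where deleting one point moves the prediction by > eps."""
    base = fit_predict(data)(x_test)
    flips = 0
    for _ in range(n_trials):
        i = rng.integers(len(data))
        loo = fit_predict(np.delete(data, i, axis=0))(x_test)
        flips += abs(base - loo) > eps
    return flips

rng = np.random.default_rng(2)
data = rng.standard_normal((200, 2))      # columns: (x, y)

def fit_predict(d):                       # toy black box: 1-nearest neighbour
    return lambda x: d[np.argmin(np.abs(d[:, 0] - x)), 1]

flips = stability_trials(fit_predict, data, x_test=0.3, eps=0.1,
                         n_trials=100, rng=rng)
# one-sided test of H0: the instability rate is at least 20%
print(stats.binomtest(int(flips), 100, 0.2, alternative="less"))
```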
{"title":"Black-box tests for algorithmic stability.","authors":"Byol Kim, Rina Foygel Barber","doi":"10.1093/imaiai/iaad039","DOIUrl":"10.1093/imaiai/iaad039","url":null,"abstract":"<p><p>Algorithmic stability is a concept from learning theory that expresses the degree to which changes to the input data (e.g. removal of a single data point) may affect the outputs of a regression algorithm. Knowing an algorithm's stability properties is often useful for many downstream applications-for example, stability is known to lead to desirable generalization properties and predictive inference guarantees. However, many modern algorithms currently used in practice are too complex for a theoretical analysis of their stability properties, and thus we can only attempt to establish these properties through an empirical exploration of the algorithm's behaviour on various datasets. In this work, we lay out a formal statistical framework for this kind of <i>black-box testing</i> without any assumptions on the algorithm or the data distribution, and establish fundamental bounds on the ability of any black-box test to identify algorithmic stability.</p>","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2023-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10576650/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41239705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian denoising of structured sources and its implications on learning-based denoising
Wenda Zhou, Joachim Wabnig, Shirin Jalali
Pub Date: 2023-09-19 | DOI: 10.1093/imaiai/iaad036
Denoising a stationary process $(X_{i})_{i\in\mathbb{Z}}$ corrupted by additive white Gaussian noise $(Z_{i})_{i\in\mathbb{Z}}$ is a classic, well-studied and fundamental problem in information theory and statistical signal processing. However, finding theoretically founded, computationally efficient denoising methods applicable to general sources is still an open problem. In the Bayesian set-up where the source distribution is known, a minimum mean square error (MMSE) denoiser estimates $X^{n}$ from noisy measurements $Y^{n}$ as $\hat{X}^{n}=\mathrm{E}[X^{n}\mid Y^{n}]$. However, for general sources, computing $\mathrm{E}[X^{n}\mid Y^{n}]$ is computationally very challenging, if not infeasible. In this paper, starting from a Bayesian set-up, a novel denoising method, the quantized maximum a posteriori (Q-MAP) denoiser, is proposed and its asymptotic performance is analysed. Both for memoryless sources and for structured first-order Markov sources, it is shown that, asymptotically, as the noise variance $\sigma_{z}^{2}$ converges to zero, $\frac{1}{\sigma_{z}^{2}}\mathrm{E}[(X_{i}-\hat{X}^{\mathrm{QMAP}}_{i})^{2}]$ converges to the information dimension of the source. For the studied memoryless sources, this limit is known to be optimal. A key advantage of the Q-MAP denoiser over an MMSE denoiser is that it makes explicit which properties of the source distribution are used in denoising. This property leads to a new learning-based denoising approach that is applicable to generic structured sources. Using the ImageNet database for training, initial simulation results exploring the performance of such a learning-based denoiser in image denoising are presented.
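For a finite-alphabet memoryless source, the MMSE denoiser $\mathrm{E}[X\mid Y]$ that the paper takes as its Bayesian baseline has a simple closed form. The sketch below implements it for an illustrative sparse binary source under Gaussian noise; the alphabet, prior and noise level are assumptions, and the Q-MAP construction itself is not reproduced.

```python
import numpy as np

def mmse_denoise(y, alphabet, prior, sigma_z):
    """E[X | Y = y] for X ~ prior on a finite alphabet, Y = X + N(0, sigma_z^2)."""
    # posterior weights: prior(x) * Gaussian likelihood of y given x
    w = prior * np.exp(-(y[:, None] - alphabet[None, :]) ** 2 / (2 * sigma_z**2))
    w /= w.sum(axis=1, keepdims=True)
    return w @ alphabet

rng = np.random.default_rng(3)
alphabet = np.array([0.0, 1.0])            # sparse binary source
prior = np.array([0.9, 0.1])
x = rng.choice(alphabet, size=10_000, p=prior)
sigma_z = 0.3
y = x + sigma_z * rng.standard_normal(x.size)
x_hat = mmse_denoise(y, alphabet, prior, sigma_z)
# per-symbol MSE normalized by the noise variance
print("normalized MSE:", np.mean((x - x_hat) ** 2) / sigma_z**2)
```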
{"title":"Bayesian denoising of structured sources and its implications on learning-based denoising","authors":"Wenda Zhou, Joachim Wabnig, Shirin Jalali","doi":"10.1093/imaiai/iaad036","DOIUrl":"https://doi.org/10.1093/imaiai/iaad036","url":null,"abstract":"Abstract Denoising a stationary process $(X_{i})_{i in mathbb{Z}}$ corrupted by additive white Gaussian noise $(Z_{i})_{i in mathbb{Z}}$ is a classic, well-studied and fundamental problem in information theory and statistical signal processing. However, finding theoretically founded computationally efficient denoising methods applicable to general sources is still an open problem. In the Bayesian set-up where the source distribution is known, a minimum mean square error (MMSE) denoiser estimates $X^{n}$ from noisy measurements $Y^{n}$ as $hat{X}^{n}=mathrm{E}[X^{n}|Y^{n}]$. However, for general sources, computing $mathrm{E}[X^{n}|Y^{n}]$ is computationally very challenging, if not infeasible. In this paper, starting from a Bayesian set-up, a novel denoising method, namely, quantized maximum a posteriori (Q-MAP) denoiser is proposed and its asymptotic performance is analysed. Both for memoryless sources, and for structured first-order Markov sources, it is shown that, asymptotically, as $sigma _{z}^{2} $ (noise variance) converges to zero, ${1over sigma _{z}^{2}} mathrm{E}[(X_{i}-hat{X}^{mathrm{QMAP}}_{i})^{2}]$ converges to the information dimension of the source. For the studied memoryless sources, this limit is known to be optimal. A key advantage of the Q-MAP denoiser, unlike an MMSE denoiser, is that it highlights the key properties of the source distribution that are to be used in its denoising. This key property leads to a new learning-based denoising approach that is applicable to generic structured sources. Using ImageNet database for training, initial simulation results exploring the performance of such a learning-based denoiser in image denoising are presented.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135060543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Near-optimal estimation of linear functionals with log-concave observation errors
Simon Foucart, Grigoris Paouris
Pub Date: 2023-09-19 | DOI: 10.1093/imaiai/iaad038
This note addresses the question of optimally estimating a linear functional of an object acquired through linear observations corrupted by random noise, where optimality pertains to a worst-case setting tied to a symmetric, convex and closed model set containing the object. It complements the article 'Statistical Estimation and Optimal Recovery' published in the Annals of Statistics in 1994. There, Donoho showed (among other things) that, for Gaussian noise, linear maps provide near-optimal estimation schemes relative to a performance measure relevant in Statistical Estimation. Here, we advocate for a different performance measure, arguably more relevant in Optimal Recovery. We show that, relative to this new measure, linear maps still provide near-optimal estimation schemes even if the noise is merely log-concave. Our arguments, which make a connection to the deterministic noise situation and bypass properties specific to the Gaussian case, offer an alternative to parts of Donoho's proof.
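To make the worst-case setting concrete, consider the special case where the model set is the Euclidean unit ball and the noise is deterministically bounded by eps (the deterministic-noise situation the authors connect to). For a linear estimate <c, y> of <a, x> from y = Ax + e, the worst-case error is ||a - A^T c||_2 + eps * ||c||_2 by Cauchy-Schwarz, with equality attainable, so a near-optimal linear map can be found numerically. This is a hedged illustration, not the paper's construction.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(8)
m, d, eps = 5, 10, 0.1
A = rng.standard_normal((m, d))           # observation map
a = rng.standard_normal(d)                # functional a: x -> <a, x>

def worst_case_error(c):
    """sup over ||x|| <= 1, ||e|| <= eps of |<a, x> - <c, A x + e>|."""
    return np.linalg.norm(a - A.T @ c) + eps * np.linalg.norm(c)

res = minimize(worst_case_error, x0=np.zeros(m), method="Nelder-Mead",
               options={"maxiter": 20000, "fatol": 1e-10, "xatol": 1e-10})
print("worst-case error of the best linear map found:", res.fun)
```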
{"title":"Near-optimal estimation of linear functionals with log-concave observation errors","authors":"Simon Foucart, Grigoris Paouris","doi":"10.1093/imaiai/iaad038","DOIUrl":"https://doi.org/10.1093/imaiai/iaad038","url":null,"abstract":"Abstract This note addresses the question of optimally estimating a linear functional of an object acquired through linear observations corrupted by random noise, where optimality pertains to a worst-case setting tied to a symmetric, convex and closed model set containing the object. It complements the article ‘Statistical Estimation and Optimal Recovery’ published in the Annals of Statistics in 1994. There, Donoho showed (among other things) that, for Gaussian noise, linear maps provide near-optimal estimation schemes relatively to a performance measure relevant in Statistical Estimation. Here, we advocate for a different performance measure arguably more relevant in Optimal Recovery. We show that, relatively to this new measure, linear maps still provide near-optimal estimation schemes even if the noise is merely log-concave. Our arguments, which make a connection to the deterministic noise situation and bypass properties specific to the Gaussian case, offer an alternative to parts of Donoho’s proof.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135010731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graph-based approximate message passing iterations
Cédric Gerbelot, Raphaël Berthier
Pub Date: 2023-09-18 | DOI: 10.1093/imaiai/iaad020
Approximate message passing (AMP) algorithms have become an important element of high-dimensional statistical inference, mostly due to their adaptability and concentration properties, captured by the state evolution (SE) equations. This is demonstrated by the growing number of new iterations proposed for increasingly complex problems, ranging from multi-layer inference to low-rank matrix estimation with elaborate priors. In this paper, we address the following questions: is there a structure underlying all AMP iterations that unifies them in a common framework? Can we use such a structure to give a modular proof of the state evolution equations, adaptable to new AMP iterations without reproducing the full argument each time? We propose an answer to both questions, showing that AMP instances can be generically indexed by an oriented graph. This enables a unified interpretation of these iterations, independent of the problem they solve, and a way of composing them arbitrarily. We then show that all AMP iterations indexed by such a graph verify rigorous SE equations, extending the reach of previous proofs and proving a number of recent heuristic derivations of those equations. Our proof naturally includes non-separable functions, and we show how existing refinements, such as spatial coupling or matrix-valued variables, can be combined with our framework.
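A concrete AMP instance helps fix ideas: the sketch below runs the classic symmetric rank-one iteration, with its Onsager memory term, on a spiked matrix. The tanh denoiser (matched to a +/-1 prior, with simplified calibration) and the problem sizes are illustrative assumptions; the paper's graph-indexed framework covers far more general compositions of such steps.

```python
import numpy as np

rng = np.random.default_rng(4)
n, lam = 2000, 2.0
v = rng.choice([-1.0, 1.0], size=n)                 # planted +/-1 spike
W = rng.standard_normal((n, n)); W = (W + W.T) / np.sqrt(2 * n)
Y = (lam / n) * np.outer(v, v) + W                  # spiked matrix observation

f = np.tanh                                         # denoiser for a +/-1 prior
fprime = lambda u: 1.0 - np.tanh(u) ** 2
x = rng.standard_normal(n)                          # random initialization
m_old = np.zeros(n)
for t in range(25):
    m = f(x)
    b = np.mean(fprime(x))                          # Onsager coefficient
    x = Y @ m - b * m_old                           # AMP step with memory term
    m_old = m
print("overlap with the spike:", abs(f(x) @ v) / n)
```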
{"title":"Graph-based approximate message passing iterations","authors":"Cédric Gerbelot, Raphaël Berthier","doi":"10.1093/imaiai/iaad020","DOIUrl":"https://doi.org/10.1093/imaiai/iaad020","url":null,"abstract":"Abstract Approximate message passing (AMP) algorithms have become an important element of high-dimensional statistical inference, mostly due to their adaptability and concentration properties, the state evolution (SE) equations. This is demonstrated by the growing number of new iterations proposed for increasingly complex problems, ranging from multi-layer inference to low-rank matrix estimation with elaborate priors. In this paper, we address the following questions: is there a structure underlying all AMP iterations that unifies them in a common framework? Can we use such a structure to give a modular proof of state evolution equations, adaptable to new AMP iterations without reproducing each time the full argument? We propose an answer to both questions, showing that AMP instances can be generically indexed by an oriented graph. This enables to give a unified interpretation of these iterations, independent from the problem they solve, and a way of composing them arbitrarily. We then show that all AMP iterations indexed by such a graph verify rigorous SE equations, extending the reach of previous proofs and proving a number of recent heuristic derivations of those equations. Our proof naturally includes non-separable functions and we show how existing refinements, such as spatial coupling or matrix-valued variables, can be combined with our framework.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135110705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spectral deconvolution of matrix models: the additive case
Pierre Tarrago
Pub Date: 2023-09-18 | DOI: 10.1093/imaiai/iaad037
We implement a complex analytic method to build an estimator of the spectrum of a matrix perturbed by the addition of a random matrix noise in the free probabilistic regime. This method, previously introduced by Arizmendi, Tarrago and Vargas, involves two steps: the first step consists of a fixed-point method to compute the Stieltjes transform of the desired distribution in a certain domain, and the second step is a classical deconvolution by a Cauchy distribution, whose parameter depends on the intensity of the noise. The method thus reduces the spectral deconvolution problem to a classical one. We provide explicit bounds for the mean squared error of the first step under the assumption that the distribution of the noise is unitarily invariant. In the case where the unknown measure is sparse or close to a distribution with a sufficiently smooth density, we prove that the resulting estimator converges to the measure in the $1$-Wasserstein distance at speed $O(1/\sqrt{N})$, where $N$ is the dimension of the matrix.
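The fixed-point step can be illustrated in the forward direction. For an atomic measure mu corrupted by semicircular noise of variance sigma^2 (free additive convolution with a semicircle), the Cauchy transform of the noisy spectrum solves the subordination fixed point $G(z) = G_{\mu}(z - \sigma^{2} G(z))$, which a plain Picard iteration evaluates for z away from the real axis. The atoms and noise level below are illustrative; the paper's estimator inverts this map before the Cauchy deconvolution step.

```python
import numpy as np

def cauchy_mu(z, atoms):
    """Cauchy transform G_mu(z) = mean(1 / (z - a_i)) of an atomic measure."""
    return np.mean(1.0 / (z - atoms))

def cauchy_noisy(z, atoms, sigma2, n_iter=200):
    """Picard iteration for G(z) = G_mu(z - sigma2 * G(z))."""
    g = cauchy_mu(z, atoms)
    for _ in range(n_iter):
        g = cauchy_mu(z - sigma2 * g, atoms)
    return g

atoms = np.array([-1.0, 0.0, 2.0])        # sparse spectrum of the clean matrix
z = 0.5 + 1.0j                            # point in the upper half-plane
print(cauchy_noisy(z, atoms, sigma2=0.25))
```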
{"title":"Spectral deconvolution of matrix models: the additive case","authors":"Pierre Tarrago","doi":"10.1093/imaiai/iaad037","DOIUrl":"https://doi.org/10.1093/imaiai/iaad037","url":null,"abstract":"Abstract We implement a complex analytic method to build an estimator of the spectrum of a matrix perturbed by the addition of a random matrix noise in the free probabilistic regime. This method, which has been previously introduced by Arizmendi, Tarrago and Vargas, involves two steps: the first step consists in a fixed point method to compute the Stieltjes transform of the desired distribution in a certain domain, and the second step is a classical deconvolution by a Cauchy distribution, whose parameter depends on the intensity of the noise. This method thus reduces the spectral deconvolution problem to a classical one. We provide explicit bounds for the mean squared error of the first step under the assumption that the distribution of the noise is unitary invariant. In the case where the unknown measure is sparse or close to a distribution with a density with enough smoothness, we prove that the resulting estimator converges to the measure in the $1$-Wasserstein distance at speed $O(1/sqrt{N})$, where $N$ is the dimension of the matrix.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135109334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High-dimensional asymptotics of Langevin dynamics in spiked matrix models
Tengyuan Liang, Subhabrata Sen, Pragya Sur
Pub Date: 2023-09-18 | DOI: 10.1093/imaiai/iaad042
We study Langevin dynamics for recovering the planted signal in the spiked matrix model. We provide a 'path-wise' characterization of the overlap between the output of the Langevin algorithm and the planted signal. This overlap is characterized in terms of a self-consistent system of integro-differential equations, usually referred to as the Crisanti–Horner–Sommers–Cugliandolo–Kurchan equations in the spin glass literature. As a second contribution, we derive an explicit formula for the limiting overlap in terms of the signal-to-noise ratio and the injected noise in the diffusion. This uncovers a sharp phase transition: in one regime, the limiting overlap is strictly positive, while in the other, the injected noise overcomes the signal and the limiting overlap is zero.
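A minimal simulation of the dynamics: discretized (projected) Langevin ascent on the spiked landscape, tracking the overlap with the planted signal. The step size, temperature, Euler-Maruyama discretization and renormalization onto the sphere are illustrative choices, not the paper's exact dynamics; sweeping the temperature against the signal-to-noise ratio is how one would probe the phase transition numerically.

```python
import numpy as np

rng = np.random.default_rng(5)
n, lam, temp, dt = 500, 3.0, 0.5, 0.01
v = rng.standard_normal(n); v /= np.linalg.norm(v)   # planted unit signal
W = rng.standard_normal((n, n)); W = (W + W.T) / np.sqrt(2 * n)
Y = lam * np.outer(v, v) + W                         # spiked matrix

x = rng.standard_normal(n); x /= np.linalg.norm(x)   # random start on the sphere
for t in range(2000):
    grad = Y @ x                                     # gradient of x^T Y x / 2
    x = x + dt * grad + np.sqrt(2 * temp * dt) * rng.standard_normal(n)
    x /= np.linalg.norm(x)                           # project back to the sphere
print("overlap with the signal:", abs(x @ v))
```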
{"title":"High-dimensional asymptotics of Langevin dynamics in spiked matrix models","authors":"Tengyuan Liang, Subhabrata Sen, Pragya Sur","doi":"10.1093/imaiai/iaad042","DOIUrl":"https://doi.org/10.1093/imaiai/iaad042","url":null,"abstract":"Abstract We study Langevin dynamics for recovering the planted signal in the spiked matrix model. We provide a ‘path-wise’ characterization of the overlap between the output of the Langevin algorithm and the planted signal. This overlap is characterized in terms of a self-consistent system of integro-differential equations, usually referred to as the Crisanti–Horner–Sommers–Cugliandolo–Kurchan equations in the spin glass literature. As a second contribution, we derive an explicit formula for the limiting overlap in terms of the signal-to-noise ratio and the injected noise in the diffusion. This uncovers a sharp phase transition—in one regime, the limiting overlap is strictly positive, while in the other, the injected noise overcomes the signal, and the limiting overlap is zero.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135256563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-marginal Gromov–Wasserstein transport and barycentres
Florian Beier, Robert Beinert, Gabriele Steidl
Pub Date: 2023-09-18 | DOI: 10.1093/imaiai/iaad041
Gromov–Wasserstein (GW) distances are combinations of Gromov–Hausdorff and Wasserstein distances that allow the comparison of two different metric measure spaces (mm-spaces). Due to their invariance under measure- and distance-preserving transformations, they are well suited for many applications in graph and shape analysis. In this paper, we introduce the concept of multi-marginal GW transport between a set of mm-spaces, as well as its regularized and unbalanced versions. As a special case, we discuss multi-marginal fused variants, which combine the structure information of an mm-space with label information from an additional label space. To tackle the new formulations numerically, we consider the bi-convex relaxation of the multi-marginal GW problem, which is tight in the balanced case if the cost function is conditionally negative definite. The relaxed model can be solved by alternating minimization, where each step can be performed by a multi-marginal Sinkhorn scheme. We show relations of our multi-marginal GW problem to (unbalanced, fused) GW barycentres and present various numerical results, which indicate the potential of the concept.
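The Sinkhorn building block is easiest to see in the classical two-marginal case: linearize the GW objective at the current plan, then solve the resulting entropic OT problem by Sinkhorn iterations, as sketched below. The point clouds, marginals and regularization strength are illustrative, and the paper's multi-marginal scheme replaces the inner solver with a multi-marginal Sinkhorn step.

```python
import numpy as np

def sinkhorn(C, mu, nu, eps, n_iter=200):
    """Entropic OT plan for cost C and marginals mu, nu."""
    K = np.exp(-(C - C.min()) / eps)      # constant shift: same optimal plan
    u, v = np.ones_like(mu), np.ones_like(nu)
    for _ in range(n_iter):
        u = mu / (K @ v)
        v = nu / (K.T @ u)
    return u[:, None] * K * v[None, :]

def entropic_gw(C1, C2, mu, nu, eps=0.5, n_outer=50):
    pi = np.outer(mu, nu)                 # independent coupling to start
    for _ in range(n_outer):
        cost = -2.0 * C1 @ pi @ C2        # linearized squared-loss GW cost
        pi = sinkhorn(cost, mu, nu, eps)  # (marginal-dependent constants dropped)
    return pi

rng = np.random.default_rng(6)
x, y = rng.standard_normal((8, 2)), rng.standard_normal((10, 2))
C1 = np.linalg.norm(x[:, None] - x[None, :], axis=-1)   # pairwise distances
C2 = np.linalg.norm(y[:, None] - y[None, :], axis=-1)
mu, nu = np.full(8, 1 / 8), np.full(10, 1 / 10)
pi = entropic_gw(C1, C2, mu, nu)
print(pi.sum(axis=1) - mu, pi.sum())      # marginal check
```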
{"title":"Multi-marginal Gromov–Wasserstein transport and barycentres","authors":"Florian Beier, Robert Beinert, Gabriele Steidl","doi":"10.1093/imaiai/iaad041","DOIUrl":"https://doi.org/10.1093/imaiai/iaad041","url":null,"abstract":"Abstract Gromov–Wasserstein (GW) distances are combinations of Gromov–Hausdorff and Wasserstein distances that allow the comparison of two different metric measure spaces (mm-spaces). Due to their invariance under measure- and distance-preserving transformations, they are well suited for many applications in graph and shape analysis. In this paper, we introduce the concept of multi-marginal GW transport between a set of mm-spaces as well as its regularized and unbalanced versions. As a special case, we discuss multi-marginal fused variants, which combine the structure information of an mm-space with label information from an additional label space. To tackle the new formulations numerically, we consider the bi-convex relaxation of the multi-marginal GW problem, which is tight in the balanced case if the cost function is conditionally negative definite. The relaxed model can be solved by an alternating minimization, where each step can be performed by a multi-marginal Sinkhorn scheme. We show relations of our multi-marginal GW problem to (unbalanced, fused) GW barycentres and present various numerical results, which indicate the potential of the concept.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135257493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Out-of-sample error estimation for M-estimators with convex penalty
Pierre C Bellec
Pub Date: 2023-09-18 | DOI: 10.1093/imaiai/iaad031
A generic out-of-sample error estimate is proposed for $M$-estimators regularized with a convex penalty in high-dimensional linear regression, where $(\boldsymbol{X},\boldsymbol{y})$ is observed and the dimension $p$ and sample size $n$ are of the same order. The out-of-sample error estimate enjoys a relative error of order $n^{-1/2}$ in a linear model with Gaussian covariates and independent noise, either non-asymptotically when $p/n\le \gamma$ or asymptotically in the high-dimensional asymptotic regime $p/n\to \gamma^{\prime}\in (0,\infty)$. General differentiable loss functions $\rho$ are allowed, provided that the derivative of the loss is 1-Lipschitz; this includes the least-squares loss as well as robust losses such as the Huber loss and its smoothed versions. The validity of the out-of-sample error estimate holds either under a strong convexity assumption, or for the L1-penalized Huber M-estimator and the Lasso under a sparsity assumption and a bound on the number of contaminated observations. For the square loss and in the absence of corruption in the response, the results additionally yield $n^{-1/2}$-consistent estimates of the noise variance and of the generalization error. This generalizes, to arbitrary convex penalty and arbitrary covariance, estimates that were previously known for the Lasso.
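For flavor, here is a classical degrees-of-freedom-based plug-in of the same general kind for the Lasso, where the degrees of freedom equal the number of selected variables: a GCV-style correction inflates the training residual. This is a hedged stand-in for illustration only; the paper's estimator is different and comes with the $n^{-1/2}$ guarantees described above.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
n, p = 200, 300
beta = np.zeros(p); beta[:10] = 1.0
X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)

model = Lasso(alpha=0.1).fit(X, y)
df = np.count_nonzero(model.coef_)              # Lasso degrees of freedom
resid = y - model.predict(X)
gcv = (resid @ resid / n) / (1 - df / n) ** 2   # GCV-style adjusted risk
print(f"train MSE {resid @ resid / n:.3f}, GCV {gcv:.3f}, df {df}")
```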
{"title":"Out-of-sample error estimation for M-estimators with convex penalty","authors":"Pierre C Bellec","doi":"10.1093/imaiai/iaad031","DOIUrl":"https://doi.org/10.1093/imaiai/iaad031","url":null,"abstract":"Abstract A generic out-of-sample error estimate is proposed for $M$-estimators regularized with a convex penalty in high-dimensional linear regression where $(boldsymbol{X},boldsymbol{y})$ is observed and the dimension $p$ and sample size $n$ are of the same order. The out-of-sample error estimate enjoys a relative error of order $n^{-1/2}$ in a linear model with Gaussian covariates and independent noise, either non-asymptotically when $p/nle gamma $ or asymptotically in the high-dimensional asymptotic regime $p/nto gamma ^{prime}in (0,infty )$. General differentiable loss functions $rho $ are allowed provided that the derivative of the loss is 1-Lipschitz; this includes the least-squares loss as well as robust losses such as the Huber loss and its smoothed versions. The validity of the out-of-sample error estimate holds either under a strong convexity assumption, or for the L1-penalized Huber M-estimator and the Lasso under a sparsity assumption and a bound on the number of contaminated observations. For the square loss and in the absence of corruption in the response, the results additionally yield $n^{-1/2}$-consistent estimates of the noise variance and of the generalization error. This generalizes, to arbitrary convex penalty and arbitrary covariance, estimates that were previously known for the Lasso.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135258177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}