
Latest Publications from Information and Inference: A Journal of the IMA

Minimum probability of error of list M-ary hypothesis testing
IF 1.6 · CAS Zone 4 (Mathematics) · Q2 MATHEMATICS, APPLIED · Pub Date: 2023-02-27 · DOI: 10.1093/imaiai/iaad001
Ehsan Asadi Kangarshahi, A. Guillén i Fàbregas
We study a variation of Bayesian $M$-ary hypothesis testing in which the test outputs a list of $L$ candidates out of the $M$ possible upon processing the observation. We study the minimum error probability of list hypothesis testing, where an error is defined as the event that the true hypothesis is not in the list output by the test. We derive two exact expressions for the minimum probability of error. The first is expressed as the error probability of a certain non-Bayesian binary hypothesis test and is reminiscent of the meta-converse bound by Polyanskiy, Poor and Verdú (2010). The second is expressed as the tail probability of the likelihood ratio between the two distributions involved in the aforementioned non-Bayesian binary hypothesis test. Keywords: hypothesis testing, error probability, information theory.
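For intuition, the optimal list test simply outputs the $L$ hypotheses of largest posterior mass, so in a small finite problem the minimum list-error probability can be computed directly. A toy sketch of the quantity being studied (our own function name and data, not the paper's expressions):

```python
import numpy as np

# Toy illustration: with a finite observation alphabet, the optimal list test
# keeps the L hypotheses of largest posterior mass, so the minimum
# list-error probability is 1 - E_Y[sum of the L largest posteriors].
def min_list_error(prior, likelihood, L):
    """prior: shape (M,); likelihood: shape (M, K), rows P(y | i)."""
    joint = prior[:, None] * likelihood                  # P(i, y)
    top_l = np.sort(joint, axis=0)[-L:, :].sum(axis=0)   # best list per y
    return 1.0 - top_l.sum()

prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([[0.9, 0.1],
                       [0.2, 0.8],
                       [0.5, 0.5]])
err_l1 = min_list_error(prior, likelihood, 1)   # ordinary MAP testing
err_l3 = min_list_error(prior, likelihood, 3)   # list always contains truth
```

With $L = M$ the list always contains the true hypothesis, so the error probability drops to zero.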
Citations: 0
A unifying view of modal clustering
IF 1.6 · CAS Zone 4 (Mathematics) · Q2 MATHEMATICS, APPLIED · Pub Date: 2022-08-01 · DOI: 10.1093/imaiai/iaac030 · Vol. 12(2), pp. 897–920
Ery Arias-Castro;Wanli Qiao
Two important non-parametric approaches to clustering emerged in the 1970s: clustering by level sets or cluster tree as proposed by Hartigan, and clustering by gradient lines or gradient flow as proposed by Fukunaga and Hostetler. In a recent paper, we draw a connection between these two approaches, in particular, by showing that the gradient flow provides a way to move along the cluster tree. Here, we argue the case that these two approaches are fundamentally the same. We do so by proposing two ways of obtaining a partition from the cluster tree—each one of them very natural in its own right—and showing that both of them reduce to the partition given by the gradient flow under standard assumptions on the sampling density.
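The gradient-flow side of this equivalence is what the mean-shift algorithm implements in practice. A minimal sketch (our own toy code, assuming a Gaussian KDE with fixed bandwidth $h$), where points ascending to the same density mode share a cluster:

```python
import numpy as np

# Our own toy sketch of the gradient-flow view of modal clustering
# (assumptions: Gaussian KDE, fixed bandwidth h): each point follows the
# mean-shift iteration, a rescaled ascent step on the estimated density,
# and points arriving at the same mode share a cluster.
def mean_shift_labels(X, h=0.5, steps=200, tol=1e-3):
    Z = X.astype(float).copy()
    for _ in range(steps):
        d2 = ((Z[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / (2.0 * h * h))              # Gaussian kernel weights
        Z = (w @ X) / w.sum(axis=1, keepdims=True)   # move to weighted mean
    modes, labels = [], []
    for z in Z:                                      # merge coincident limits
        for k, m in enumerate(modes):
            if np.linalg.norm(z - m) < tol:
                labels.append(k)
                break
        else:
            modes.append(z)
            labels.append(len(modes) - 1)
    return np.array(labels)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.2, (20, 2)),       # two well-separated blobs
               rng.normal(5.0, 0.2, (20, 2))])
labels = mean_shift_labels(X)
```

On two well-separated blobs, all points of a blob flow to the same mode and the two blobs receive different labels.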
Citations: 0
On sharp stochastic zeroth-order Hessian estimators over Riemannian manifolds
IF 1.6 · CAS Zone 4 (Mathematics) · Q2 MATHEMATICS, APPLIED · Pub Date: 2022-08-01 · DOI: 10.1093/imaiai/iaac027 · Vol. 12(2), pp. 787–813
Tianyu Wang
We study Hessian estimators for functions defined over an $n$-dimensional complete analytic Riemannian manifold. We introduce new stochastic zeroth-order Hessian estimators using $O(1)$ function evaluations. We show that, for an analytic real-valued function $f$, our estimator achieves a bias bound of order $O(\gamma \delta^2)$, where $\gamma$ depends on both the Levi–Civita connection and the function $f$, and $\delta$ is the finite-difference step size. To the best of our knowledge, our results provide the first bias bound for Hessian estimators that explicitly depends on the geometry of the underlying Riemannian manifold. We also study downstream computations based on our Hessian estimators. The superiority of our method is evidenced by empirical evaluations.
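In the Euclidean special case the flavor of such estimators can be seen with a classical Stein-type construction: a few function evaluations per random direction give a Hessian estimate that is unbiased on quadratics. This is our own Euclidean sketch; the paper's manifold estimators and bias constants differ:

```python
import numpy as np

# Our own Euclidean sketch of a Stein-type stochastic zeroth-order Hessian
# estimator (the paper's manifold constructions differ): for a standard
# Gaussian direction u, (u u^T - I) * [f(x+du) + f(x-du) - 2 f(x)] / (2 d^2)
# has expectation equal to the Hessian when f is quadratic.
def zo_hessian(f, x, delta=1e-2, n_samples=30_000, seed=0):
    rng = np.random.default_rng(seed)
    d = x.size
    H = np.zeros((d, d))
    eye = np.eye(d)
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        c = f(x + delta * u) + f(x - delta * u) - 2.0 * f(x)
        H += (np.outer(u, u) - eye) * (c / (2.0 * delta ** 2))
    return H / n_samples

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
f = lambda v: v @ A @ v                 # Hessian of v^T A v is 2A
H_hat = zo_hessian(f, np.array([0.3, -0.7]))
```

Averaging over many directions, the estimate concentrates around the true Hessian $2A$.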
Citations: 0
Exit Time Analysis for Approximations of Gradient Descent Trajectories Around Saddle Points
IF 1.6 · CAS Zone 4 (Mathematics) · Q2 MATHEMATICS, APPLIED · Pub Date: 2022-08-01 · DOI: 10.1093/imaiai/iaac025 · Vol. 12(2), pp. 714–786
Rishabh Dixit;Mert Gürbüzbalaban;Waheed U Bajwa
This paper considers the problem of understanding the exit time for trajectories of gradient-related first-order methods from saddle neighborhoods under some initial boundary conditions. Given the ‘flat’ geometry around saddle points, first-order methods can struggle to escape these regions quickly due to the small magnitudes of the gradients encountered. In particular, while it is known that gradient-related first-order methods escape strict-saddle neighborhoods, existing analytic techniques do not explicitly leverage the local geometry around saddle points to control the behavior of gradient trajectories. It is in this context that this paper puts forth a rigorous geometric analysis of the gradient-descent method around strict-saddle neighborhoods using matrix perturbation theory. In doing so, it provides a key result that can be used to generate an approximate gradient trajectory for any given initial conditions. In addition, the analysis leads to a linear exit-time solution for the gradient-descent method under certain necessary initial conditions, which explicitly brings out the dependence on problem dimension, conditioning of the saddle neighborhood, and more, for a class of strict-saddle functions.
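The phenomenon is easy to see on the canonical strict saddle $f(x, y) = (x^2 - y^2)/2$, where gradient descent is exactly linear and the exit time follows from the expansion rate of the unstable coordinate. This is a toy example of ours, not the paper's general analysis:

```python
import math

# Our own toy, not the paper's analysis: on f(x, y) = (x^2 - y^2) / 2,
# gradient descent contracts the stable coordinate by (1 - eta) and expands
# the unstable one by (1 + eta), so the exit time from a radius-r ball is
# approximately log(r / |y0|) / log(1 + eta).
def exit_time(x0, y0, eta=0.1, r=1.0, max_iter=10_000):
    x, y, t = x0, y0, 0
    while x * x + y * y <= r * r and t < max_iter:
        x, y = x - eta * x, y + eta * y   # exact GD step on (x^2 - y^2)/2
        t += 1
    return t

t_exit = exit_time(0.5, 1e-6)
t_predicted = math.log(1.0 / 1e-6) / math.log(1.1)  # linear-dynamics estimate
```

A tiny unstable offset $y_0 = 10^{-6}$ makes the escape slow, exactly as the logarithmic prediction says.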
Citations: 3
Uncertainty quantification in the Bradley–Terry–Luce model
IF 1.6 · CAS Zone 4 (Mathematics) · Q2 MATHEMATICS, APPLIED · Pub Date: 2022-08-01 · DOI: 10.1093/imaiai/iaac032 · Vol. 12(2), pp. 1073–1140
Chao Gao;Yandi Shen;Anderson Y Zhang
The Bradley–Terry–Luce (BTL) model is a benchmark model for pairwise comparisons between individuals. Despite recent progress on the first-order asymptotics of several popular procedures, the understanding of uncertainty quantification in the BTL model remains largely incomplete, especially when the underlying comparison graph is sparse. In this paper, we fill this gap by focusing on two estimators that have received much recent attention: the maximum likelihood estimator (MLE) and the spectral estimator. Using a unified proof strategy, we derive sharp and uniform non-asymptotic expansions for both estimators in the sparsest possible regime (up to some poly-logarithmic factors) of the underlying comparison graph. These expansions allow us to obtain: (i) finite-dimensional central limit theorems for both estimators; (ii) construction of confidence intervals for individual ranks; (iii) optimal constant of $\ell_2$ estimation, which is achieved by the MLE but not by the spectral estimator. Our proof is based on a self-consistent equation of the second-order remainder vector and a novel leave-two-out analysis.
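As a concrete anchor for the model: in BTL, $P(i \text{ beats } j) = e^{w_i} / (e^{w_i} + e^{w_j})$, and the MLE maximizes the log-likelihood of observed pairwise outcomes. A minimal gradient-ascent fit (our own toy sketch; the paper analyzes this estimator's expansions, not fitting code):

```python
import numpy as np

# Our own toy MLE fit for the BTL model, where
# P(i beats j) = exp(w_i) / (exp(w_i) + exp(w_j));
# plain gradient ascent on the pairwise-outcome log-likelihood.
def btl_mle(n, wins, steps=2000, lr=0.1):
    """wins[(i, j)] = number of times item i beat item j."""
    w = np.zeros(n)
    total = sum(wins.values())
    for _ in range(steps):
        g = np.zeros(n)
        for (i, j), c in wins.items():
            p = 1.0 / (1.0 + np.exp(w[j] - w[i]))  # P(i beats j)
            g[i] += c * (1.0 - p)                  # d log-lik / d w_i
            g[j] -= c * (1.0 - p)                  # d log-lik / d w_j
        w += lr * g / total
        w -= w.mean()          # the model is invariant to a common shift
    return w

wins = {(0, 1): 8, (1, 0): 2, (1, 2): 7, (2, 1): 3, (0, 2): 9, (2, 0): 1}
w_hat = btl_mle(3, wins)
```

The fitted strengths recover the ordering implied by the win counts.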
Citations: 11
Optimal orthogonal group synchronization and rotation group synchronization
IF 1.6 · CAS Zone 4 (Mathematics) · Q2 MATHEMATICS, APPLIED · Pub Date: 2022-08-01 · DOI: 10.1093/imaiai/iaac022 · Vol. 12(2), pp. 591–632
Chao Gao;Anderson Y Zhang
We study the statistical estimation problem of orthogonal group synchronization and rotation group synchronization. The model is $Y_{ij} = Z_i^* Z_j^{*T} + \sigma W_{ij} \in \mathbb{R}^{d \times d}$, where $W_{ij}$ is a Gaussian random matrix and $Z_i^*$ is either an orthogonal matrix or a rotation matrix, and each $Y_{ij}$ is observed independently with probability $p$. We analyze an iterative polar decomposition algorithm for the estimation of $Z^*$ and show it has an error of $(1+o(1))\frac{\sigma^2 d(d-1)}{2np}$ when initialized by spectral methods. A matching minimax lower bound is further established that leads to the optimality of the proposed algorithm, as it achieves the exact minimax risk.
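The projection step such an algorithm iterates is the polar map: the nearest orthogonal matrix in Frobenius norm to $M = USV^T$ is $UV^T$. A minimal sketch of just this building block on one noisy observation (our own toy; synchronization proper alternates neighbor averaging with this projection):

```python
import numpy as np

# Sketch of the polar projection building block: the nearest orthogonal
# matrix (in Frobenius norm) to M = U S V^T is U V^T. Here we only
# illustrate the projection on a single noisy observation (our own toy).
def polar(M):
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt

rng = np.random.default_rng(0)
Z, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # a ground-truth O(3) matrix
noisy = Z + 0.05 * rng.standard_normal((3, 3))     # Gaussian perturbation
Z_hat = polar(noisy)
```

The output is exactly orthogonal, and by optimality it is at least as close to the noisy observation as the ground truth is.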
Citations: 7
Fast splitting algorithms for sparsity-constrained and noisy group testing
IF 1.6 · CAS Zone 4 (Mathematics) · Q2 MATHEMATICS, APPLIED · Pub Date: 2022-08-01 · DOI: 10.1093/imaiai/iaac031 · Vol. 12(2), pp. 1141–1171
Eric Price;Jonathan Scarlett;Nelvin Tan
In group testing, the goal is to identify a subset of defective items within a larger set of items based on tests whose outcomes indicate whether at least one defective item is present. This problem is relevant in areas such as medical testing, DNA sequencing, communication protocols and many more. In this paper, we study (i) a sparsity-constrained version of the problem, in which the testing procedure is subjected to one of the following two constraints: items are finitely divisible and thus may participate in at most $\gamma$ tests; or tests are size-constrained to pool no more than $\rho$ items per test; and (ii) a noisy version of the problem, where each test outcome is independently flipped with some constant probability. Under each of these settings, considering the for-each recovery guarantee with asymptotically vanishing error probability, we introduce a fast splitting algorithm and establish its near-optimality not only in terms of the number of tests, but also in terms of the decoding time. While the most basic formulations of our algorithms require $\varOmega(n)$ storage for each algorithm, we also provide low-storage variants based on hashing, with similar recovery guarantees.
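For background, the splitting idea in its most basic noiseless, unconstrained form is binary splitting: halve any pool that tests positive and test both halves. A toy sketch of ours; the paper's algorithms additionally handle the sparsity constraints and noise:

```python
# Our own noiseless binary-splitting sketch: any pool that tests positive is
# halved and both halves are tested, locating each defective with far fewer
# tests than testing every item individually.
def binary_splitting(n, defective):
    found, tests = [], 0
    stack = [list(range(n))]
    while stack:
        group = stack.pop()
        tests += 1                                  # one pooled test
        if not any(i in defective for i in group):  # pool is clean
            continue
        if len(group) == 1:
            found.append(group[0])
        else:
            mid = len(group) // 2
            stack += [group[:mid], group[mid:]]
    return sorted(found), tests

found, n_tests = binary_splitting(64, {5, 40})
```

Two defectives among 64 items are located with well under 64 pooled tests.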
Citations: 4
Sparse recovery by reduced variance stochastic approximation
IF 1.6 · CAS Zone 4 (Mathematics) · Q2 MATHEMATICS, APPLIED · Pub Date: 2022-08-01 · DOI: 10.1093/imaiai/iaac028 · Vol. 12(2), pp. 851–896
Anatoli Juditsky;Andrei Kulunchakov;Hlib Tsyntseus
In this paper, we discuss the application of iterative Stochastic Optimization routines to the problem of sparse signal recovery from noisy observations. Using the Stochastic Mirror Descent algorithm as a building block, we develop a multistage procedure for recovery of sparse solutions to the Stochastic Optimization problem under assumptions of smoothness and quadratic minoration on the expected objective. An interesting feature of the proposed algorithm is linear convergence of the approximate solution during the preliminary phase of the routine, when the component of stochastic error in the gradient observation, which is due to bad initial approximation of the optimal solution, is larger than the ‘ideal’ asymptotic error component owing to observation noise ‘at the optimal solution’. We also show how one can straightforwardly enhance reliability of the corresponding solution using Median-of-Means-like techniques. We illustrate the performance of the proposed algorithms in application to classical problems of recovery of sparse and low-rank signals in the generalized linear regression framework. We show, under rather weak assumptions on the regressor and noise distributions, how they lead to parameter estimates which obey (up to factors which are logarithmic in problem dimension and confidence level) the best known accuracy bounds.
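A stripped-down stand-in conveys the multistage structure: stochastic gradient passes followed by hard thresholding to the sparsity level, with the step size shrunk between stages. This is our own simplification, using plain SGD and thresholding in place of the paper's mirror-descent machinery:

```python
import numpy as np

# Our own simplified stand-in for the multistage structure (plain SGD plus
# hard thresholding instead of the paper's mirror-descent routine): each
# stage runs stochastic gradient steps for sparse linear regression,
# thresholds to the s largest coordinates, then halves the step size.
def multistage_sparse_sgd(X, y, s, stages=5, inner=2000, lr0=0.01, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    lr = lr0
    for _ in range(stages):
        for _ in range(inner):
            i = rng.integers(n)
            w -= lr * (X[i] @ w - y[i]) * X[i]   # stochastic gradient step
        keep = np.argsort(np.abs(w))[-s:]        # hard-threshold to s coords
        mask = np.zeros(d)
        mask[keep] = 1.0
        w *= mask
        lr /= 2.0                                # shrink step size per stage
    return w

rng = np.random.default_rng(1)
d, n, s = 50, 400, 3
w_true = np.zeros(d)
w_true[[3, 17, 30]] = [2.0, -1.5, 1.0]
X = rng.standard_normal((n, d))
y = X @ w_true + 0.05 * rng.standard_normal(n)
w_hat = multistage_sparse_sgd(X, y, s)
```

On a well-conditioned Gaussian design the staged procedure recovers the support exactly and the coefficients accurately.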
Citations: 5
On the robustness to adversarial corruption and to heavy-tailed data of the Stahel–Donoho median of means
IF 1.6 · CAS Zone 4 (Mathematics) · Q2 MATHEMATICS, APPLIED · Pub Date: 2022-08-01 · DOI: 10.1093/imaiai/iaac026 · Vol. 12(2), pp. 814–850
Jules Depersin;Guillaume Lecué
We consider median of means (MOM) versions of the Stahel–Donoho outlyingness (SDO) [23, 66] and of the Median Absolute Deviation (MAD) [30] functions to construct subgaussian estimators of a mean vector under adversarial contamination and heavy-tailed data. We develop a single analysis of the MOM version of the SDO which covers all cases ranging from the Gaussian case to the $L_2$ case. It is based on isomorphic and almost isometric properties of the MOM versions of SDO and MAD. This analysis also covers cases where the mean does not even exist but a location parameter does; in those cases we still recover the same subgaussian rates and the same price for adversarial contamination even though there is not even a first moment. These properties are achieved by the classical SDO median and are therefore the first non-asymptotic statistical bounds on the Stahel–Donoho median complementing the $\sqrt{n}$-consistency [58] and asymptotic normality [74] of the Stahel–Donoho estimators. We also show that the MOM version of MAD can be used to construct an estimator of the covariance matrix only under the existence of a second moment or of a scatter matrix if a second moment does not exist.
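The MOM primitive underlying these constructions is simple to state in one dimension: average within blocks, then take the median of the block means, so a few arbitrarily corrupted points can spoil at most a few blocks. A minimal sketch of ours, not the multivariate SDO/MAD estimators of the paper:

```python
import numpy as np

# Minimal univariate median-of-means sketch (our own; the paper's SDO/MAD
# constructions are multivariate): block means followed by their median,
# so a handful of corrupted points can spoil at most a handful of blocks.
def median_of_means(x, n_blocks):
    blocks = np.array_split(np.asarray(x, dtype=float), n_blocks)
    return float(np.median([b.mean() for b in blocks]))

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
x[:5] = 1e6                        # adversarially corrupted sample points
mom = median_of_means(x, 25)
naive = x.mean()
```

Five wildly corrupted points destroy the plain mean but barely move the MOM estimate.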
Citations: 0
The geometry of adversarial training in binary classification
IF 1.6 · CAS Zone 4 (Mathematics) · Q2 MATHEMATICS, APPLIED · Pub Date: 2022-08-01 · DOI: 10.1093/imaiai/iaac029 · Vol. 12(2), pp. 921–968
Leon Bungert;Nicolás García Trillos;Ryan Murray
We establish an equivalence between a family of adversarial training problems for non-parametric binary classification and a family of regularized risk minimization problems where the regularizer is a nonlocal perimeter functional. The resulting regularized risk minimization problems admit exact convex relaxations of the type $L^1 + \text{(nonlocal)}\operatorname{TV}$, a form frequently studied in image analysis and graph-based learning. A rich geometric structure is revealed by this reformulation which in turn allows us to establish a series of properties of optimal solutions of the original problem, including the existence of minimal and maximal solutions (interpreted in a suitable sense) and the existence of regular solutions (also interpreted in a suitable sense). In addition, we highlight how the connection between adversarial training and perimeter minimization problems provides a novel, directly interpretable, statistical motivation for a family of regularized risk minimization problems involving perimeter/total variation. The majority of our theoretical results are independent of the distance used to define adversarial attacks.
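The objects in this equivalence are easy to evaluate on a graph discretization: an empirical risk plus a weight times a perimeter counting disagreeing neighbor pairs. An illustrative toy of ours; the paper works with the continuum nonlocal perimeter:

```python
import numpy as np

# Our own toy on a graph discretization: the regularized objective pairs an
# empirical 0-1 risk with eps times a perimeter term counting disagreeing
# neighbor pairs (the continuum nonlocal perimeter is the paper's analogue).
def tv_risk(labels, y, edges, eps):
    risk = float(np.mean(labels != y))                 # empirical 0-1 risk
    perimeter = sum(abs(int(labels[i]) - int(labels[j])) for i, j in edges)
    return risk + eps * perimeter

y = np.array([0, 0, 0, 1, 1, 1])                       # observed labels
edges = [(i, i + 1) for i in range(5)]                 # path graph
clean = np.array([0, 0, 0, 1, 1, 1])                   # one boundary edge
noisy = np.array([0, 1, 0, 1, 0, 1])                   # five boundary edges
```

The clean labeling wins on both terms: it fits the data and has a shorter boundary.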
Information and Inference: A Journal of the IMA, 12(2), 921-968 (August 2022).
Citations: 13
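The equivalence stated in the abstract above can be sketched schematically. The notation below ($\mu_0,\mu_1$ for the class-conditional measures, $\varepsilon$ for the adversarial budget, $A^{\varepsilon}$ for the $\varepsilon$-dilation of the decision region $A$, and $\operatorname{Per}_{\varepsilon}$ for the nonlocal perimeter) is assumed for illustration and is not taken verbatim from the paper:

```latex
% Adversarial risk of a decision region A (schematic):
\inf_{A}\ \Big[\, \mu_0\big(A^{\varepsilon}\big) + \mu_1\big((A^{c})^{\varepsilon}\big) \Big]
\;=\;
\inf_{A}\ \Big[\, \underbrace{\mu_0(A) + \mu_1(A^{c})}_{\text{standard risk}}
\;+\; \varepsilon\,\operatorname{Per}_{\varepsilon}(A) \Big]
```

Read this way, the adversary's budget $\varepsilon$ plays the role of a regularization parameter multiplying a perimeter penalty, which is the statistical motivation for perimeter/total-variation regularization highlighted in the abstract.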
Journal: Information and Inference-A Journal of the Ima