
Latest publications in Information and Inference: A Journal of the IMA

The limits of distribution-free conditional predictive inference
IF 1.6 | CAS Zone 4 (Mathematics) | Q2 MATHEMATICS, APPLIED | Pub Date: 2020-10-01 | DOI: 10.1093/imaiai/iaaa017
Rina Foygel Barber;Emmanuel J Candès;Aaditya Ramdas;Ryan J Tibshirani
We consider the problem of distribution-free predictive inference, with the goal of producing predictive coverage guarantees that hold conditionally rather than marginally. Existing methods such as conformal prediction offer marginal coverage guarantees, where predictive coverage holds on average over all possible test points. This is not sufficient for many practical applications in which we would like to know that our predictions are valid for a given individual, not merely on average over a population. On the other hand, exact conditional inference guarantees are known to be impossible without imposing assumptions on the underlying distribution. In this work, we aim to explore the space between these two extremes and examine what types of relaxations of the conditional coverage property would alleviate some of the practical concerns with marginal coverage guarantees while still being achievable in a distribution-free setting.
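As context for the marginal guarantee discussed above, here is a minimal split-conformal sketch (an illustration, not the paper's construction; `predict` stands for any fixed regression function, which is an assumption of this sketch). The interval attains roughly 1 - alpha coverage on average over test points, not conditionally for each individual:

```python
import numpy as np

def split_conformal_interval(predict, X_cal, y_cal, x_test, alpha=0.1):
    """Split conformal prediction: marginal 1-alpha coverage under exchangeability.

    `predict` is any fixed regression function (fit on separate data);
    the guarantee holds on average over test points, not per individual.
    """
    scores = np.abs(y_cal - predict(X_cal))           # calibration residuals
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level)                    # finite-sample-corrected quantile
    pred = predict(x_test)
    return pred - q, pred + q

# Usage: even a deliberately crude predictor keeps ~90% marginal coverage.
rng = np.random.default_rng(0)
X_cal = rng.normal(size=500)
y_cal = X_cal + rng.normal(size=500)
predict = lambda x: np.asarray(x)                     # identity "model"
lo, hi = split_conformal_interval(predict, X_cal, y_cal, 0.0)
```

The point of the abstract is precisely that this averaged guarantee can hide poor coverage on identifiable subpopulations.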
Citations: 162
Oracle inequalities for square root analysis estimators with application to total variation penalties
IF 1.6 | CAS Zone 4 (Mathematics) | Q2 MATHEMATICS, APPLIED | Pub Date: 2020-10-01 | DOI: 10.1093/imaiai/iaaa002
Francesco Ortelli;Sara van de Geer
Through the direct study of the analysis estimator we derive oracle inequalities with fast and slow rates by adapting the arguments involving projections by Dalalyan et al. (2017, Bernoulli, 23, 552–581). We then extend the theory to the square root analysis estimator. Finally, we focus on (square root) total variation regularized estimators on graphs and obtain constant-friendly rates, which, up to log terms, match previous results obtained by entropy calculations. We also obtain an oracle inequality for the (square root) total variation regularized estimator over the cycle graph.
Citations: 6
Composite optimization for robust rank one bilinear sensing
IF 1.6 | CAS Zone 4 (Mathematics) | Q2 MATHEMATICS, APPLIED | Pub Date: 2020-10-01 | DOI: 10.1093/imaiai/iaaa027
Vasileios Charisopoulos;Damek Davis;Mateo Díaz;Dmitriy Drusvyatskiy
We consider the task of recovering a pair of vectors from a set of rank one bilinear measurements, possibly corrupted by noise. Most notably, the problem of robust blind deconvolution can be modeled in this way. We consider a natural nonsmooth formulation of the rank one bilinear sensing problem and show that its moduli of weak convexity, sharpness and Lipschitz continuity are all dimension independent, under favorable statistical assumptions. This phenomenon persists even when up to half of the measurements are corrupted by noise. Consequently, standard algorithms, such as the subgradient and prox-linear methods, converge at a rapid dimension-independent rate when initialized within a constant relative error of the solution. We complete the paper with a new initialization strategy, complementing the local search algorithms. The initialization procedure is both provably efficient and robust to outlying measurements. Numerical experiments, on both simulated and real data, illustrate the developed theory and methods.
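The nonsmooth formulation and the subgradient method described above can be sketched on a toy instance (an illustration under assumed problem sizes and step schedule, not the paper's experimental setup): minimize the robust objective F(u, v) = (1/m) sum_i |<a_i, u><b_i, v> - y_i| with a geometrically decaying step size, starting within a constant relative error of the solution.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 20, 400
u_star, v_star = rng.normal(size=d), rng.normal(size=d)
A, B = rng.normal(size=(m, d)), rng.normal(size=(m, d))
y = (A @ u_star) * (B @ v_star)                 # rank one bilinear measurements
outliers = rng.random(m) < 0.2                  # corrupt 20% of the measurements
y[outliers] += 10.0 * rng.normal(size=outliers.sum())

# Subgradient method on the l1 loss, with geometric step decay,
# initialized within a constant relative error of (u_star, v_star).
u = u_star + 0.3 * rng.normal(size=d)
v = v_star + 0.3 * rng.normal(size=d)
err0 = np.linalg.norm(np.outer(u, v) - np.outer(u_star, v_star))
step = 0.01
for _ in range(1000):
    s = np.sign((A @ u) * (B @ v) - y)          # subgradient of the l1 residuals
    gu = A.T @ (s * (B @ v)) / m
    gv = B.T @ (s * (A @ u)) / m
    u, v = u - step * gu, v - step * gv
    step *= 0.995
err1 = np.linalg.norm(np.outer(u, v) - np.outer(u_star, v_star))
```

The error is measured on the outer product u v^T, which is invariant to the inherent scale ambiguity (u, v) -> (c u, v / c).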
Citations: 4
Robust and resource efficient identification of shallow neural networks by fewest samples
IF 1.6 | CAS Zone 4 (Mathematics) | Q2 MATHEMATICS, APPLIED | Pub Date: 2020-10-01 | DOI: 10.1093/imaiai/iaaa036
Massimo Fornasier;Jan Vybíral;Ingrid Daubechies
We address the structure identification and the uniform approximation of sums of ridge functions $f(x)=\sum_{i=1}^m g_i(\langle a_i,x\rangle)$ on $\mathbb{R}^d$, representing a general form of a shallow feed-forward neural network, from a small number of query samples. Higher-order differentiation, as used in our constructive approximations, of sums of ridge functions or of their compositions, as in deeper neural networks, yields a natural connection between neural network weight identification and tensor product decomposition identification. In the case of the shallowest feed-forward neural network, second-order differentiation and tensors of order two (i.e., matrices) suffice, as we prove in this paper. We use two sampling schemes to perform approximate differentiation: active sampling, where the sampling points are universal, actively and randomly designed, and passive sampling, where sampling points are preselected at random from a distribution with known density. Based on multiple gathered approximated first- and second-order differentials, our general approximation strategy is developed as a sequence of algorithms to perform individual sub-tasks. We first perform an active subspace search by approximating the span of the weight vectors $a_1,\dots,a_m$. Then we use a straightforward substitution, which reduces the dimensionality of the problem from $d$ to $m$. The core of the construction is then the stable and efficient approximation of weights expressed in terms of rank-$1$ matrices $a_i \otimes a_i$, realized by formulating their individual identification as a suitable nonlinear program. We prove the successful identification by this program of weight vectors being close to orthonormal, and we also show how we can constructively reduce to this case by a whitening procedure, without loss of any generality. We finally discuss the implementation and the performance of the proposed algorithmic pipeline with extensive numerical experiments, which illustrate and confirm the theoretical results.
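The connection between second-order differentiation and the weight vectors can be sketched as follows (a toy illustration under assumed ridge profiles $g_i$, not the paper's full pipeline): since the Hessian of $f(x)=\sum_i g_i(\langle a_i,x\rangle)$ is $\sum_i g_i''(\langle a_i,x\rangle)\,a_i a_i^\top$, a handful of finite-difference Hessians at random query points already determine the active subspace spanned by the $a_i$.

```python
import numpy as np

rng = np.random.default_rng(2)
d, m = 5, 2
A = rng.normal(size=(m, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)   # unit weight vectors a_1, a_2
g = [np.tanh, np.sin]                           # hypothetical ridge profiles g_i

def f(x):
    """Shallow network f(x) = sum_i g_i(<a_i, x>)."""
    return sum(g[i](A[i] @ x) for i in range(m))

def fd_hessian(func, x, h=1e-4):
    """Approximate Hessian from function queries via central differences."""
    n = len(x)
    H = np.zeros((n, n))
    E = h * np.eye(n)
    for j in range(n):
        for k in range(n):
            H[j, k] = (func(x + E[j] + E[k]) - func(x + E[j] - E[k])
                       - func(x - E[j] + E[k]) + func(x - E[j] - E[k])) / (4 * h * h)
    return H

# Each Hessian lies in span{a_i a_i^T}; summing H @ H keeps the column
# space inside span{a_i}, so the top-m eigenvectors recover that span.
M = np.zeros((d, d))
for _ in range(5):
    H = fd_hessian(f, rng.normal(size=d))
    M += H @ H
U = np.linalg.eigh(M)[1][:, -m:]                # estimated active subspace
```

Projecting each true $a_i$ onto the estimated subspace should leave it essentially unchanged, which is the subspace-search step of the strategy above.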
Citations: 12
Sensitivity of ℓ1 minimization to parameter choice
IF 1.6 | CAS Zone 4 (Mathematics) | Q2 MATHEMATICS, APPLIED | Pub Date: 2020-10-01 | DOI: 10.1093/imaiai/iaaa014
Aaron Berk;Yaniv Plan;Özgür Yilmaz
The use of the generalized Lasso is a common technique for recovery of structured high-dimensional signals. There are three common formulations of the generalized Lasso; each program has a governing parameter whose optimal value depends on properties of the data. At this optimal value, compressed sensing theory explains why Lasso programs recover structured high-dimensional signals with minimax order-optimal error. Unfortunately, in practice the optimal choice is generally unknown and must be estimated. Thus, we investigate the stability of each of the three Lasso programs with respect to its governing parameter. Our goal is to aid the practitioner in answering the following question: given real data, which Lasso program should be used? We take a step towards answering this by analysing the case where the measurement matrix is the identity (the so-called proximal denoising setup) and we use $\ell_{1}$ regularization. For each Lasso program, we specify settings in which that program is provably unstable with respect to its governing parameter. We support our analysis with detailed numerical simulations. For example, there are settings where a 0.1% underestimate of a Lasso parameter can increase the error significantly and a 50% underestimate can cause the error to increase by a factor of $10^{9}$.
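In the proximal denoising setup with $\ell_1$ regularization, the unconstrained estimator has a closed form (soft thresholding), which makes parameter sensitivity easy to probe numerically. A small sketch (the signal and the $\sigma\sqrt{2\log d}$ reference level are illustrative choices, not the paper's exact experiment):

```python
import numpy as np

def soft_threshold(y, lam):
    """Prox of lam * ||.||_1: closed-form solution of the unconstrained
    Lasso denoising program min_x 0.5 * ||x - y||_2^2 + lam * ||x||_1."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

rng = np.random.default_rng(3)
d, s, sigma = 10_000, 20, 1.0
theta = np.zeros(d)
theta[:s] = 50.0                               # sparse ground truth
y = theta + sigma * rng.normal(size=d)

lam_ref = sigma * np.sqrt(2 * np.log(d))       # classical reference level
errs = {c: float(np.linalg.norm(soft_threshold(y, c * lam_ref) - theta))
        for c in (0.1, 1.0)}                   # 90% underestimate vs reference
```

Underestimating the parameter leaves most of the noise unthresholded, so the error grows markedly; the paper's point is that for some Lasso formulations this sensitivity can be catastrophically worse.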
Citations: 19
Super-resolution of near-colliding point sources
IF 1.6 | CAS Zone 4 (Mathematics) | Q2 MATHEMATICS, APPLIED | Pub Date: 2020-10-01 | DOI: 10.1093/imaiai/iaaa005
Dmitry Batenkov;Gil Goldman;Yosef Yomdin
We consider the problem of stable recovery of sparse signals of the form $$F(x)=\sum_{j=1}^d a_j\delta(x-x_j),\quad x_j\in\mathbb{R},\; a_j\in\mathbb{C},$$ from their spectral measurements, known in a bandwidth $\varOmega$ with absolute error not exceeding $\epsilon>0$. We consider the case when at most $p\leqslant d$ nodes $\{x_j\}$ of $F$ form a cluster whose extent is smaller than the Rayleigh limit ${1\over\varOmega}$, while the rest of the nodes are well separated. Provided that $\epsilon \lessapprox \operatorname{SRF}^{-2p+1}$, where $\operatorname{SRF}=(\varOmega\varDelta)^{-1}$ and $\varDelta$ is the minimal separation between the nodes, we show that the minimax error rate for reconstruction of the cluster nodes is of order ${1\over\varOmega}\operatorname{SRF}^{2p-1}\epsilon$, while for recovering the corresponding amplitudes $\{a_j\}$ the rate is of the order $\operatorname{SRF}^{2p-1}\epsilon$. Moreover, the corresponding minimax rates for the recovery of the non-clustered nodes and amplitudes are ${\epsilon\over\varOmega}$ and $\epsilon$, respectively. These results suggest that stable super-resolution is possible in much more general situations than previously thought. Our numerical experiments show that the well-known matrix pencil method achieves the above accuracy bounds.
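The matrix pencil method mentioned in the abstract can be sketched in a few lines for noiseless data (an illustrative toy configuration with one near-colliding pair, not the paper's experiments): from spectral samples $c_k=\sum_j a_j e^{\mathrm{i}kx_j}$, the eigenvalues of $H_0^{+}H_1$, built from two shifted Hankel matrices, are exactly $e^{\mathrm{i}x_j}$.

```python
import numpy as np

x_true = np.array([0.5, 0.52, 2.0])             # two near-colliding nodes
a_true = np.array([1.0, -1.0, 0.7])
d = len(x_true)

# Noiseless spectral samples c_k = sum_j a_j exp(i k x_j), k = 0, ..., 2d-1
k = np.arange(2 * d)
c = np.exp(1j * np.outer(k, x_true)) @ a_true

# Matrix pencil: with Hankel matrices H0 = [c_{i+j}] and H1 = [c_{i+j+1}],
# the eigenvalues of pinv(H0) @ H1 are exp(i x_j) for distinct nodes and
# nonzero amplitudes.
H0 = np.array([[c[i + j] for j in range(d)] for i in range(d)])
H1 = np.array([[c[i + j + 1] for j in range(d)] for i in range(d)])
z = np.linalg.eigvals(np.linalg.pinv(H0) @ H1)
x_est = np.sort(np.angle(z))
```

With noisy samples the recovered nodes in the cluster degrade at the SRF-dependent rates quantified above.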
Citations: 48
Low-rank matrix completion and denoising under Poisson noise
IF 1.6 | CAS Zone 4 (Mathematics) | Q2 MATHEMATICS, APPLIED | Pub Date: 2020-10-01 | DOI: 10.1093/imaiai/iaaa020
Andrew D McRae;Mark A Davenport
This paper considers the problem of estimating a low-rank matrix from the observation of all or a subset of its entries in the presence of Poisson noise. When we observe all entries, this is a problem of matrix denoising; when we observe only a subset of the entries, this is a problem of matrix completion. In both cases, we exploit an assumption that the underlying matrix is low-rank. Specifically, we analyse several estimators, including a constrained nuclear-norm minimization program, nuclear-norm regularized least squares and a non-convex constrained low-rank optimization problem. We show that for all three estimators, with high probability, we have an upper error bound (in the Frobenius norm error metric) that depends on the matrix rank, the fraction of the elements observed and the maximal row and column sums of the true matrix. We furthermore show that the above results are minimax optimal (within a universal constant) in classes of matrices with low-rank and bounded row and column sums. We also extend these results to handle the case of matrix multinomial denoising and completion.
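In the fully observed case, nuclear-norm regularized least squares reduces to singular-value soft thresholding, which is easy to sketch (the sizes, rank and threshold below are illustrative assumptions, not the paper's tuned constants):

```python
import numpy as np

def svt(Y, tau):
    """Singular-value soft thresholding: the closed-form solution of
    min_X 0.5 * ||X - Y||_F^2 + tau * ||X||_*."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(5)
n, r = 60, 2
M = rng.uniform(1, 3, size=(n, r)) @ rng.uniform(1, 3, size=(r, n))  # low-rank mean
Y = rng.poisson(M).astype(float)               # entrywise Poisson observations
tau = 2.0 * np.sqrt(n * Y.mean())              # ~ spectral norm of the Poisson noise
M_hat = svt(Y, tau)
```

Thresholding at roughly the spectral norm of the noise removes most of it while retaining the low-rank signal; the paper's bounds make this precise in terms of the rank and the row and column sums of the true matrix.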
Citations: 12
Mutual information for low-rank even-order symmetric tensor estimation
IF 1.6 | CAS Zone 4 (Mathematics) | Q2 MATHEMATICS, APPLIED | Pub Date: 2020-09-24 | DOI: 10.1093/imaiai/iaaa022
Clément Luneau, Jean Barbier, N. Macris
We consider a statistical model for finite-rank symmetric tensor factorization and prove a single-letter variational expression for its asymptotic mutual information when the tensor is of even order. The proof applies the adaptive interpolation method originally invented for rank-one factorization. Here we show how to extend the adaptive interpolation to finite-rank and even-order tensors. This requires new nontrivial ideas with respect to the current analysis in the literature. We also underline where the proof falls short when dealing with odd-order tensors.
Citations: 14
Two-sample statistics based on anisotropic kernels.
IF 1.6 | CAS Zone 4 (Mathematics) | Q2 MATHEMATICS, APPLIED | Pub Date: 2020-09-01 | Epub Date: 2019-12-10 | DOI: 10.1093/imaiai/iaz018
Xiuyuan Cheng, Alexander Cloninger, Ronald R Coifman

The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely many multivariate samples. When the distributions are locally low-dimensional, the proposed test can be made more powerful to distinguish certain alternatives by incorporating local covariance matrices and constructing an anisotropic kernel. The kernel matrix is asymmetric; it computes the affinity between [Formula: see text] data points and a set of [Formula: see text] reference points, where [Formula: see text] can be drastically smaller than [Formula: see text]. While the proposed statistic can be viewed as a special class of Reproducing Kernel Hilbert Space MMD, the consistency of the test is proved, under mild assumptions of the kernel, as long as [Formula: see text], and a finite-sample lower bound of the testing power is obtained. Applications to flow cytometry and diffusion MRI datasets are demonstrated, which motivate the proposed approach to compare distributions.
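A minimal kernel MMD sketch for two-sample comparison (the paper's anisotropic variant would replace the Euclidean distance by a locally whitened one built from local covariances, and would compute affinities against a smaller reference set; this sketch keeps the plain symmetric isotropic kernel for brevity):

```python
import numpy as np

def mmd2(X, Y, bw=1.0):
    """Biased (V-statistic) estimate of squared MMD with a Gaussian kernel."""
    def gram(P, Q):
        d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * bw ** 2))
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()

rng = np.random.default_rng(6)
X = rng.normal(0.0, 1.0, size=(300, 2))
Y_same = rng.normal(0.0, 1.0, size=(300, 2))    # same distribution as X
Y_shift = rng.normal(1.0, 1.0, size=(300, 2))   # mean-shifted alternative
stat_same, stat_shift = mmd2(X, Y_same), mmd2(X, Y_shift)
```

The statistic is near zero for matched distributions and clearly positive under the shift; the anisotropic construction sharpens this contrast when the distributions are locally low-dimensional.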

Citations: 16
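The asymmetric kernel construction described above can be sketched in a few lines: build an affinity matrix between the data points and a much smaller set of reference points, where each reference point carries a local covariance matrix that makes the Gaussian kernel anisotropic, then compare the mean affinity vectors of the two samples. The function names, the specific Gaussian form and the plug-in statistic below are illustrative assumptions for the sketch, not the paper's exact construction.

```python
import numpy as np

def anisotropic_affinity(X, refs, covs, eps=1.0):
    """Asymmetric affinity matrix between n data points and n_R reference
    points.  Each reference point carries its own local covariance, so the
    Gaussian kernel stretches along the local geometry of the data.
    (Illustrative form; not the paper's exact kernel.)"""
    K = np.zeros((X.shape[0], len(refs)))
    for r, (mu, C) in enumerate(zip(refs, covs)):
        diff = X - mu                                   # (n, d) displacements
        # Mahalanobis distance of every point to the r-th reference
        mahal = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(C), diff)
        K[:, r] = np.exp(-mahal / (2.0 * eps))          # anisotropic Gaussian
    return K

def mmd_statistic(X, Y, refs, covs, eps=1.0):
    """Plug-in MMD-type statistic: squared Euclidean distance between the
    mean reference-affinity vectors of the two samples."""
    mean_x = anisotropic_affinity(X, refs, covs, eps).mean(axis=0)
    mean_y = anisotropic_affinity(Y, refs, covs, eps).mean(axis=0)
    return float(np.sum((mean_x - mean_y) ** 2))
```

Because the affinity is evaluated only against the n_R reference points, the cost scales with n · n_R rather than n², which is what allows n_R to be drastically smaller than n as the abstract emphasizes.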
Sparse confidence sets for normal mean models 正态均值模型的稀疏置信集
IF 1.6 4区 数学 Q2 MATHEMATICS, APPLIED Pub Date : 2020-08-17 DOI: 10.1093/imaiai/iaad003
Y. Ning, Guang Cheng
In this paper, we propose a new framework to construct confidence sets for a $d$-dimensional unknown sparse parameter ${\boldsymbol \theta }$ under the normal mean model ${\boldsymbol X}\sim N({\boldsymbol \theta },\sigma ^{2}\mathbf{I})$. A key feature of the proposed confidence set is its capability to account for the sparsity of ${\boldsymbol \theta }$, hence the name sparse confidence set. This is in sharp contrast with classical methods, such as Bonferroni confidence intervals and other resampling-based procedures, where the sparsity of ${\boldsymbol \theta }$ is often ignored. Specifically, we require the desired sparse confidence set to satisfy the following two conditions: (i) uniformly over the parameter space, the coverage probability for ${\boldsymbol \theta }$ is above a pre-specified level; (ii) there exists a random subset $S$ of $\{1,\ldots ,d\}$ such that $S$ guarantees the pre-specified true negative rate for detecting non-zero $\theta _{j}$'s. To exploit the sparsity of ${\boldsymbol \theta }$, we allow the confidence interval for $\theta _{j}$ to degenerate to the single point 0 for any $j\notin S$. Under this new framework, we first consider whether there exist sparse confidence sets that satisfy the above two conditions. To address this question, we establish a non-asymptotic minimax lower bound for the non-coverage probability over a suitable class of sparse confidence sets. The lower bound deciphers the role of sparsity and the minimum signal-to-noise ratio (SNR) in the construction of sparse confidence sets. Furthermore, under suitable conditions on the SNR, a two-stage procedure is proposed to construct a sparse confidence set. To evaluate its optimality, the proposed sparse confidence set is shown to attain a minimax lower bound of some properly defined risk function up to a constant factor. Finally, we develop an adaptive procedure for unknown sparsity. Numerical studies are conducted to verify the theoretical results.
{"title":"Sparse confidence sets for normal mean models","authors":"Y. Ning, Guang Cheng","doi":"10.1093/imaiai/iaad003","DOIUrl":"https://doi.org/10.1093/imaiai/iaad003","abstract":"In this paper, we propose a new framework to construct confidence sets for a $d$-dimensional unknown sparse parameter ${\boldsymbol \theta }$ under the normal mean model ${\boldsymbol X}\sim N({\boldsymbol \theta },\sigma ^{2}\mathbf{I})$. A key feature of the proposed confidence set is its capability to account for the sparsity of ${\boldsymbol \theta }$, thus named as sparse confidence set. This is in sharp contrast with the classical methods, such as the Bonferroni confidence intervals and other resampling-based procedures, where the sparsity of ${\boldsymbol \theta }$ is often ignored. Specifically, we require the desired sparse confidence set to satisfy the following two conditions: (i) uniformly over the parameter space, the coverage probability for ${\boldsymbol \theta }$ is above a pre-specified level; (ii) there exists a random subset $S$ of $\{1,\ldots ,d\}$ such that $S$ guarantees the pre-specified true negative rate for detecting non-zero $\theta _{j}$'s. To exploit the sparsity of ${\boldsymbol \theta }$, we allow the confidence interval for $\theta _{j}$ to degenerate to a single point 0 for any $j\notin S$. Under this new framework, we first consider whether there exist sparse confidence sets that satisfy the above two conditions. To address this question, we establish a non-asymptotic minimax lower bound for the non-coverage probability over a suitable class of sparse confidence sets. The lower bound deciphers the role of sparsity and minimum signal-to-noise ratio (SNR) in the construction of sparse confidence sets. Furthermore, under suitable conditions on the SNR, a two-stage procedure is proposed to construct a sparse confidence set. To evaluate the optimality, the proposed sparse confidence set is shown to attain a minimax lower bound of some properly defined risk function up to a constant factor. Finally, we develop an adaptive procedure to the unknown sparsity. Numerical studies are conducted to verify the theoretical results.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"42 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80096659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
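The two-stage idea — first screen coordinates to form the random set $S$, then report full-length intervals only on $S$ and the degenerate interval $\{0\}$ elsewhere — can be sketched as follows. The universal threshold $\sigma\sqrt{2\log d}$ and the Bonferroni interval widths are simplified stand-ins for the paper's construction, chosen only to illustrate the shape of a sparse confidence set.

```python
import numpy as np
from scipy.stats import norm

def sparse_confidence_set(x, sigma=1.0, alpha=0.05):
    """Illustrative two-stage sparse confidence set for X ~ N(theta, sigma^2 I).

    Stage 1 screens coordinates with the universal threshold
    sigma * sqrt(2 log d); the survivors form the random set S.
    Stage 2 places Bonferroni-corrected intervals on S only, while every
    coordinate outside S gets the degenerate interval {0}.
    (Simplified stand-in for the paper's construction.)"""
    d = len(x)
    tau = sigma * np.sqrt(2.0 * np.log(d))              # screening threshold
    S = np.flatnonzero(np.abs(x) > tau)                 # selected coordinates
    z = norm.ppf(1.0 - alpha / (2.0 * max(len(S), 1)))  # Bonferroni over S
    intervals = {j: (0.0, 0.0) for j in range(d)}       # degenerate {0} off S
    for j in S:
        intervals[j] = (x[j] - z * sigma, x[j] + z * sigma)
    return S, intervals
```

Collapsing the coordinates outside $S$ to the point 0 is what makes the set sparse: its total volume depends only on $|S|$ rather than on $d$, while the screening threshold controls how often a truly zero coordinate escapes the degenerate interval.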