Annals of Statistics: Latest Articles

Which bridge estimator is the best for variable selection?
IF 4.5; Zone 1 (Mathematics); Q1, Statistics & Probability; Pub Date: 2020-10-01; DOI: 10.1214/19-AOS1906
Shuaiwen Wang, Haolei Weng, A. Maleki
Annals of Statistics, vol. 48, pp. 2791-2823.
Citations: 18
Estimation and inference for precision matrices of nonstationary time series
IF 4.5; Zone 1 (Mathematics); Q1, Statistics & Probability; Pub Date: 2020-08-01; DOI: 10.1214/19-aos1894
Xiucai Ding, Zhou Zhou
Annals of Statistics, vol. 48, pp. 2455-2477.
Citations: 9
Extending the validity of frequency domain bootstrap methods to general stationary processes
IF 4.5; Zone 1 (Mathematics); Q1, Statistics & Probability; Pub Date: 2020-08-01; DOI: 10.1214/19-aos1892
M. Meyer, E. Paparoditis, Jens-Peter Kreiss
Existing frequency domain methods for bootstrapping time series have a limited range of validity. Essentially, these procedures cover the case of linear time series with independent innovations, and some even require the time series to be Gaussian. In this paper we propose a new frequency domain bootstrap method, the hybrid periodogram bootstrap (HPB), which is consistent for a much wider range of stationary, even nonlinear, processes and which can be applied to a large class of periodogram-based statistics. The HPB is designed to combine desirable features of different frequency domain techniques while overcoming their respective limitations. It is capable of imitating the weak dependence structure of the periodogram by invoking the concept of convolved subsampling in a novel way that is tailor-made for periodograms. We show consistency of the HPB procedure for a general class of stationary time series, ranging clearly beyond linear processes, and for spectral means and ratio statistics, on which we mainly focus. The finite sample performance of the new bootstrap procedure is illustrated via simulations.
Annals of Statistics, vol. 48, pp. 2404-2427.
Citations: 11
Double-slicing assisted sufficient dimension reduction for high-dimensional censored data
IF 4.5; Zone 1 (Mathematics); Q1, Statistics & Probability; Pub Date: 2020-08-01; DOI: 10.1214/19-aos1880
Shanshan Ding, W. Qian, Lan Wang
This paper provides a unified framework and an efficient algorithm for analyzing high-dimensional survival data under weak modeling assumptions. In particular, it imposes neither a parametric distributional assumption nor a linear regression assumption. It only assumes that the survival time T depends on a high-dimensional covariate vector X through low-dimensional linear combinations of covariates ΓX. The censoring time is allowed to be conditionally independent of the survival time given the covariates. This general framework includes many popular parametric and semiparametric survival regression models as special cases. The proposed algorithm produces a number of practically useful outputs with theoretical guarantees, including a consistent estimate of the sufficient dimension reduction subspace of T | X, a uniformly consistent Kaplan-Meier type estimator of the conditional distribution function of T, and a consistent estimator of the conditional quantile survival time. Our asymptotic results significantly extend the classical theory of sufficient dimension reduction for censored data (particularly that of Li et al. 1999) and the celebrated nonparametric Kaplan-Meier estimator to the setting where the number of covariates p diverges exponentially fast with the sample size n. We demonstrate the promising performance of the proposed new estimators through simulations and a real data example.
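For readers unfamiliar with the Kaplan-Meier estimator that the abstract builds on, the following is a minimal sketch of the classical, unconditional product-limit estimator. It is background only and is not the conditional Kaplan-Meier type estimator proposed in the paper; the function name `kaplan_meier` and the toy data are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Classical Kaplan-Meier survival curve (illustrative sketch).

    times  : observed times (event or censoring time for each subject)
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit update: multiply by (1 - d_t / n_t).
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Toy data: the subject observed at time 2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

The censored subject leaves the risk set without producing a drop in the curve, which is the feature that distinguishes the estimator from a plain empirical survival function.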
Annals of Statistics, vol. 48, pp. 2132-2154.
Citations: 5
Beyond HC: More sensitive tests for rare/weak alternatives
IF 4.5; Zone 1 (Mathematics); Q1, Statistics & Probability; Pub Date: 2020-08-01; DOI: 10.1214/19-aos1885
Thomas Porter, M. Stewart
Higher criticism (HC) is a popular method for large-scale inference problems based on identifying unusually high proportions of small p-values. It has been shown to enjoy a lower-order optimality property in a simple normal location mixture model, which is shared by the ‘tailor-made’ parametric generalised likelihood ratio test (GLRT) for the same model; however, HC has also been shown to perform well outside this ‘narrow’ model. We develop a higher-order framework for analysing the power of these and similar procedures, which reveals the perhaps unsurprising fact that the GLRT enjoys an edge in power over HC for the normal location mixture model. We also identify a similar parametric mixture model to which HC is similarly ‘tailor-made’ and show that the situation is (at least partly) reversed there. We also show that in the normal location mixture model a procedure based on the empirical moment-generating function enjoys the same local power properties as the GLRT and may be recommended as an easy-to-implement (and interpret) complementary procedure to HC. Some other practical advice regarding the implementation of these procedures is provided. Finally, we provide some simulation results to help interpret our theoretical findings.
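The idea of flagging an unusually high proportion of small p-values can be made concrete with a short sketch. The code below uses the commonly cited Donoho-Jin form of the HC statistic, which is an assumption here since the abstract does not state a formula; the function name `higher_criticism`, the cutoff parameter `alpha0`, and the toy p-values are likewise illustrative.

```python
import math


def higher_criticism(pvalues, alpha0=0.5):
    """Higher criticism statistic in the Donoho-Jin form (illustrative sketch).

    Maximises sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i)))
    over the smallest alpha0 fraction of the sorted p-values.
    """
    p = sorted(pvalues)
    n = len(p)
    best = -math.inf
    for i in range(1, max(1, int(alpha0 * n)) + 1):
        pi = p[i - 1]
        if 0.0 < pi < 1.0:  # skip degenerate p-values to avoid division by zero
            z = math.sqrt(n) * (i / n - pi) / math.sqrt(pi * (1.0 - pi))
            best = max(best, z)
    return best


n = 1000
# Evenly spread p-values, mimicking a sample with no signal.
null_like = [(i - 0.5) / n for i in range(1, n + 1)]
# Same sample, but the 20 largest p-values are replaced by very small ones.
with_signal = [1e-6] * 20 + null_like[: n - 20]

hc_null = higher_criticism(null_like)
hc_signal = higher_criticism(with_signal)
print(f"HC without signal: {hc_null:.3f}")
print(f"HC with 20 small p-values: {hc_signal:.3f}")
```

Only 2% of the p-values are altered in the second sample, yet the statistic grows by orders of magnitude, which is the rare/weak detection setting the abstract refers to.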
Annals of Statistics, vol. 48, pp. 2230-2252.
Citations: 5
Bayesian analysis of the covariance matrix of a multivariate normal distribution with a new class of priors
IF 4.5; Zone 1 (Mathematics); Q1, Statistics & Probability; Pub Date: 2020-08-01; DOI: 10.1214/19-aos1891
J. Berger, Dongchu Sun, Chengyuan Song
Bayesian analysis for the covariance matrix of a multivariate normal distribution has received a lot of attention in the last two decades. In this paper, we propose a new class of priors for the covariance matrix, including both inverse Wishart and reference priors as special cases. The main motivation for the new class is to have available priors – both subjective and objective – that do not “force eigenvalues apart,” which is a criticism of inverse Wishart and Jeffreys priors. Extensive comparison of these ‘shrinkage priors’ with inverse Wishart and Jeffreys priors is undertaken, with the new priors seeming to have considerably better performance. A number of curious facts about the new priors are also observed, such as that the posterior distribution will be proper with just three vector observations from the multivariate normal distribution – regardless of the dimension of the covariance matrix – and that useful inference about features of the covariance matrix can be possible. Finally, a new MCMC algorithm is developed for this class of priors and is shown to be computationally effective for matrices of up to 100 dimensions.
Annals of Statistics, vol. 48, pp. 2381-2403.
Citations: 7
GRID: A variable selection and structure discovery method for high dimensional nonparametric regression
IF 4.5; Zone 1 (Mathematics); Q1, Statistics & Probability; Pub Date: 2020-06-01; DOI: 10.1214/19-aos1846
F. Giordano, S. Lahiri, M. L. Parrella
We consider nonparametric regression in high dimensions where only a relatively small subset of a large number of variables is relevant and may have nonlinear effects on the response. We develop methods for variable selection, structure discovery and estimation of the true low-dimensional regression function, allowing any degree of interactions among the relevant variables that need not be specified a priori. The proposed method, called the GRID, combines empirical likelihood based marginal testing with the local linear estimation machinery in a novel way to select the relevant variables. Further, it provides a simple graphical tool for identifying the low dimensional nonlinear structure of the regression function. Theoretical results establish consistency of variable selection and structure discovery, and also the oracle risk property of the GRID estimator of the regression function, allowing the dimension $d$ of the covariates to grow with the sample size $n$ at the rate $d = O(n^{a})$ for any $a \in (0, \infty)$ and the number of relevant covariates $r$ to grow at a rate $r = O(n^{\gamma})$ for some $\gamma \in (0, 1)$ under some regularity conditions that, in particular, require finiteness of certain absolute moments of the error variables depending on $a$. Finite sample properties of the GRID are investigated in a moderately large simulation study.
Annals of Statistics, vol. 48, pp. 1848-1874.
Citations: 4
Entrywise eigenvector analysis of random matrices with low expected rank
IF 4.5; Zone 1 (Mathematics); Q1, Statistics & Probability; Pub Date: 2020-06-01; Epub Date: 2020-07-17; DOI: 10.1214/19-aos1854
Emmanuel Abbe, Jianqing Fan, Kaizheng Wang, Yiqiao Zhong

Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, such as in factor analysis, community detection, ranking, matrix completion, among others. While a large variety of bounds are available for average errors between empirical and population statistics of eigenvectors, few results are tight for entrywise analyses, which are critical for a number of problems such as community detection. This paper investigates entrywise behaviors of eigenvectors for a large class of random matrices whose expectations are low-rank, which helps settle the conjecture in Abbe et al. (2014b) that the spectral algorithm achieves exact recovery in the stochastic block model without any trimming or cleaning steps. The key is a first-order approximation of eigenvectors under the $\ell_\infty$ norm: $u_k \approx \frac{A u_k^*}{\lambda_k^*}$, where $\{u_k\}$ and $\{u_k^*\}$ are eigenvectors of a random matrix $A$ and its expectation $\mathbb{E}A$, respectively. The fact that the approximation is both tight and linear in $A$ facilitates sharp comparisons between $u_k$ and $u_k^*$. In particular, it allows for comparing the signs of $u_k$ and $u_k^*$ even if $\| u_k - u_k^* \|_\infty$ is large. The results are further extended to perturbations of eigenspaces, yielding new $\ell_\infty$-type bounds for synchronization ($\mathbb{Z}_2$-spiked Wigner model) and noisy matrix completion.
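A small simulation can make the first-order approximation tangible. The sketch below is not taken from the paper: it assumes a rank-one expectation with a balanced sign vector, symmetric standard Gaussian noise, and hypothetical values for the size `n` and signal strength `lam`. It compares the entrywise error of the leading eigenvector against its population counterpart and against the linear approximation (A times the population eigenvector, divided by the population eigenvalue), using plain power iteration to stay dependency-free.

```python
import math
import random


def matvec(a, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def top_eigenvector(a, start, iters=50):
    """Power iteration for the leading eigenvector of a symmetric matrix."""
    v = start
    for _ in range(iters):
        w = matvec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


random.seed(0)
n, lam = 200, 150.0  # hypothetical size and signal strength

# Population eigenvector u* with +/- signs (two balanced groups).
u_star = [(1.0 if i < n // 2 else -1.0) / math.sqrt(n) for i in range(n)]

# A = lam * u* u*^T + W with symmetric standard Gaussian noise W,
# so that E[A] has leading eigenpair (lam, u*).
a = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        a[i][j] = a[j][i] = lam * u_star[i] * u_star[j] + random.gauss(0.0, 1.0)

start = [random.gauss(0.0, 1.0) for _ in range(n)]
u = top_eigenvector(a, start)
if sum(x * y for x, y in zip(u, u_star)) < 0:  # resolve the sign ambiguity
    u = [-x for x in u]

first_order = [x / lam for x in matvec(a, u_star)]  # A u* / lam

err_zeroth = max(abs(x - y) for x, y in zip(u, u_star))
err_first = max(abs(x - y) for x, y in zip(u, first_order))
signs_agree = all((x > 0) == (y > 0) for x, y in zip(u, u_star))

print(f"max |u - u*|         = {err_zeroth:.4f}")
print(f"max |u - A u* / lam| = {err_first:.4f}")
print(f"all signs recovered  = {signs_agree}")
```

Because the approximation is linear in A, its entries are simple sums of independent noise terms, which is what makes entrywise (rather than average) control tractable in this toy setting.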
Annals of Statistics, vol. 48, no. 3, pp. 1452-1474.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8046180/pdf/nihms-1053828.pdf
Citations: 0
On post dimension reduction statistical inference
IF 4.5; Zone 1 (Mathematics); Q1, Statistics & Probability; Pub Date: 2020-06-01; DOI: 10.1214/19-aos1859
Kyongwon Kim, Bing Li, Zhou Yu, Lexin Li
Annals of Statistics, vol. 48, pp. 1567-1592.
Citations: 13
A unified study of nonparametric inference for monotone functions
IF 4.5; Zone 1 (Mathematics); Q1, Statistics & Probability; Pub Date: 2020-04-01; Epub Date: 2020-05-26; DOI: 10.1214/19-aos1835
Ted Westling, Marco Carone

The problem of nonparametric inference on a monotone function has been extensively studied in many particular cases. Estimators considered have often been of so-called Grenander type, being representable as the left derivative of the greatest convex minorant or least concave majorant of an estimator of a primitive function. In this paper, we provide general conditions for consistency and pointwise convergence in distribution of a class of generalized Grenander-type estimators of a monotone function. This broad class allows the minorization or majorization operation to be performed on a data-dependent transformation of the domain, possibly yielding benefits in practice. Additionally, we provide simpler conditions and more concrete distributional theory in the important case that the primitive estimator and data-dependent transformation function are asymptotically linear. We use our general results in the context of various well-studied problems, and show that we readily recover classical results established separately in each case. More importantly, we show that our results allow us to tackle more challenging problems involving parameters for which the use of flexible learning strategies appears necessary. In particular, we study inference on monotone density and hazard functions using informatively right-censored data, extending the classical work on independent censoring, and on a covariate-marginalized conditional mean function, extending the classical work on monotone regression functions.
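The "left derivative of the least concave majorant" construction is easiest to see in the classical special case: a nonincreasing density on the positive half-line, with the empirical distribution function as the primitive estimator. The sketch below covers only that textbook case, not the generalized estimators studied in the paper; the function names and the three-point sample are illustrative assumptions, and sample values are assumed strictly positive.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def grenander(sample):
    """Grenander estimator of a nonincreasing density on (0, inf).

    Returns (left, right, height) pieces: the estimate equals `height`
    on (left, right]. Heights are the slopes of the least concave
    majorant of the empirical distribution function.
    """
    xs = sorted(sample)
    n = len(xs)
    # Vertices of the empirical CDF, starting from the origin.
    pts = [(0.0, 0.0)] + [(x, (i + 1) / n) for i, x in enumerate(xs)]
    hull = []
    for p in pts:
        # Drop the last vertex while it lies on or below the chord to p,
        # which leaves the upper (concave) hull of the points.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return [
        (x0, x1, (y1 - y0) / (x1 - x0))
        for (x0, y0), (x1, y1) in zip(hull, hull[1:])
    ]


pieces = grenander([1.0, 3.0, 4.0])
for left, right, height in pieces:
    print(f"({left}, {right}]: density {height:.4f}")
```

For this sample the raw slopes of the empirical CDF are 1/3, 1/6 and 1/3, which is not monotone; taking the concave majorant pools the last two intervals into a single piece of height 2/9, so the resulting step function is nonincreasing and still integrates to one.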
Annals of Statistics, vol. 48, no. 2, pp. 1001-1024.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7377427/pdf/nihms-1597646.pdf
Citations: 0