The Annals of Statistics: Latest Articles

Willem van Zwet’s contributions to the profession
Pub Date : 2021-10-01 DOI: 10.1214/21-aos2053
N. Fisher, A. Smith
Citations: 0
Inference for a two-stage enrichment design
Pub Date : 2021-10-01 DOI: 10.1214/21-aos2051
Zhantao Lin, N. Flournoy, W. Rosenberger
Citations: 3
Minimax rates for conditional density estimation via empirical entropy
Pub Date : 2021-09-21 DOI: 10.1214/23-AOS2270
Blair Bilodeau, Dylan J. Foster, Daniel M. Roy
We consider the task of estimating a conditional density using i.i.d. samples from a joint distribution, which is a fundamental problem with applications in both classification and uncertainty quantification for regression. For joint density estimation, minimax rates have been characterized for general density classes in terms of uniform (metric) entropy, a well-studied notion of statistical capacity. When applying these results to conditional density estimation, the use of uniform entropy -- which is infinite when the covariate space is unbounded and suffers from the curse of dimensionality -- can lead to suboptimal rates. Consequently, minimax rates for conditional density estimation cannot be characterized using these classical results. We resolve this problem for well-specified models, obtaining matching (within logarithmic factors) upper and lower bounds on the minimax Kullback--Leibler risk in terms of the empirical Hellinger entropy for the conditional density class. The use of empirical entropy allows us to appeal to concentration arguments based on local Rademacher complexity, which -- in contrast to uniform entropy -- leads to matching rates for large, potentially nonparametric classes and captures the correct dependence on the complexity of the covariate space. Our results require only that the conditional densities are bounded above, and do not require that they are bounded below or otherwise satisfy any tail conditions.
Citations: 12
Asymptotic normality for eigenvalue statistics of a general sample covariance matrix when p/n→∞ and applications
Pub Date : 2021-09-14 DOI: 10.1214/23-aos2300
Jiaxin Qiu, Zeng Li, Jianfeng Yao
The asymptotic normality for a large family of eigenvalue statistics of a general sample covariance matrix is derived under the ultra-high dimensional setting, that is, when the dimension to sample size ratio $p/n \to \infty$. Based on this CLT result, we first adapt the covariance matrix test problem to the new ultra-high dimensional context. Then as a second application, we develop a new test for the separable covariance structure of a matrix-valued white noise. Simulation experiments are conducted for the investigation of finite-sample properties of the general asymptotic normality of eigenvalue statistics, as well as the second test for separable covariance structure of matrix-valued white noise.
Citations: 4
Rerandomization with diminishing covariate imbalance and diverging number of covariates
Pub Date : 2021-09-06 DOI: 10.1214/22-aos2235
Yuhao Wang, Xinran Li
Completely randomized experiments have been the gold standard for drawing causal inference because they can balance all potential confounding on average. However, they may suffer from unbalanced covariates for realized treatment assignments. Rerandomization, a design that rerandomizes the treatment assignment until a prespecified covariate balance criterion is met, has recently attracted attention due to its easy implementation, improved covariate balance and more efficient inference. Researchers have then suggested using the treatment assignments that minimize the covariate imbalance, namely the optimally balanced design. This has again raised the long-standing controversy between two philosophies for designing experiments: randomization versus optimal and thus almost deterministic designs. Existing literature argued that rerandomization with overly balanced observed covariates can lead to highly imbalanced unobserved covariates, making it vulnerable to model misspecification. On the contrary, rerandomization with properly balanced covariates can provide robust inference for treatment effects while sacrificing some efficiency compared to the ideally optimal design. In this paper, we show it is possible that, by making the covariate imbalance diminish at a proper rate as the sample size increases, rerandomization can achieve the ideally optimal precision that one can expect with perfectly balanced covariates, while still maintaining its robustness. We further investigate conditions on the number of covariates for achieving the desired optimality. Our results rely on a more delicate asymptotic analysis for rerandomization, allowing both diminishing covariate imbalance threshold (or equivalently the acceptance probability) and diverging number of covariates. The derived theory for rerandomization provides a deeper understanding of its large-sample property and can better guide its practical implementation. Furthermore, it also helps reconcile the controversy between randomized and optimal designs in an asymptotic sense.
Citations: 5
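The rerandomization design described in the Wang and Li abstract above (redraw the treatment assignment until a prespecified covariate balance criterion is met) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names `imbalance` and `rerandomize`, the simulated covariates, and the threshold value are hypothetical, and the balance measure is a simplified stand-in that coincides with the Mahalanobis distance only when the covariates are uncorrelated. It is not the authors' procedure or their asymptotic regime.

```python
import random
import statistics


def imbalance(x, assign):
    """Sum over covariates of squared standardized differences in means.

    Simplifying assumption: covariates are treated as uncorrelated, so this
    matches the Mahalanobis distance only in that case.
    """
    n1 = sum(assign)
    n0 = len(assign) - n1
    total = 0.0
    for k in range(len(x[0])):
        col = [row[k] for row in x]
        mean_treated = sum(v for v, a in zip(col, assign) if a) / n1
        mean_control = sum(v for v, a in zip(col, assign) if not a) / n0
        var = statistics.variance(col)
        total += (mean_treated - mean_control) ** 2 / (var * (1 / n1 + 1 / n0))
    return total


def rerandomize(x, n_treated, threshold, rng, max_draws=100_000):
    """Redraw complete randomizations until the imbalance is at most threshold."""
    base = [1] * n_treated + [0] * (len(x) - n_treated)
    for draws in range(1, max_draws + 1):
        assign = base[:]
        rng.shuffle(assign)
        if imbalance(x, assign) <= threshold:
            return assign, draws
    raise RuntimeError("no acceptable assignment found; threshold may be too strict")


rng = random.Random(0)
# Hypothetical data: 40 units with two independent standard normal covariates.
x = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(40)]
assign, draws = rerandomize(x, n_treated=20, threshold=0.5, rng=rng)
print("draws needed:", draws, "imbalance:", imbalance(x, assign))
```

Lowering `threshold` corresponds to a smaller acceptance probability, so more draws are needed on average; the abstract concerns how fast that threshold may shrink with the sample size.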
Uniform consistency in nonparametric mixture models
Pub Date : 2021-08-31 DOI: 10.1214/22-aos2255
Bryon Aragam, Ruiyi Yang
We study uniform consistency in nonparametric mixture models as well as closely related mixture of regression (also known as mixed regression) models, where the regression functions are allowed to be nonparametric and the error distributions are assumed to be convolutions of a Gaussian density. We construct uniformly consistent estimators under general conditions while simultaneously highlighting several pain points in extending existing pointwise consistency results to uniform results. The resulting analysis turns out to be nontrivial, and several novel technical tools are developed along the way. In the case of mixed regression, we prove $L^1$ convergence of the regression functions while allowing for the component regression functions to intersect arbitrarily often, which presents additional technical challenges. We also consider generalizations to general (i.e. non-convolutional) nonparametric mixtures.
Citations: 4
Conditional sequential Monte Carlo in high dimensions
Pub Date : 2021-08-23 DOI: 10.1214/22-aos2252
Axel Finke, Alexandre Hoang Thiery
The iterated conditional sequential Monte Carlo (i-CSMC) algorithm from Andrieu, Doucet and Holenstein (2010) is an MCMC approach for efficiently sampling from the joint posterior distribution of the $T$ latent states in challenging time-series models, e.g. in non-linear or non-Gaussian state-space models. It is also the main ingredient in particle Gibbs samplers which infer unknown model parameters alongside the latent states. In this work, we first prove that the i-CSMC algorithm suffers from a curse of dimension in the dimension of the states, $D$: it breaks down unless the number of samples ("particles"), $N$, proposed by the algorithm grows exponentially with $D$. Then, we present a novel "local" version of the algorithm which proposes particles using Gaussian random-walk moves that are suitably scaled with $D$. We prove that this iterated random-walk conditional sequential Monte Carlo (i-RW-CSMC) algorithm avoids the curse of dimension: for arbitrary $N$, its acceptance rates and expected squared jumping distance converge to non-trivial limits as $D \to \infty$. If $T = N = 1$, our proposed algorithm reduces to a Metropolis--Hastings or Barker's algorithm with Gaussian random-walk moves and we recover the well-known scaling limits for such algorithms.
Citations: 2
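The special case $T = N = 1$ mentioned at the end of the Finke and Thiery abstract above, a Metropolis--Hastings sampler with Gaussian random-walk moves scaled with $D$, can be sketched as follows. This is only that special case, not the i-RW-CSMC algorithm. The standard Gaussian target, the function names, and the step-size constant `ell` are assumptions chosen for illustration; the proposal standard deviation is taken as `ell / sqrt(D)`, the classical random-walk scaling.

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of a standard Gaussian; an assumed toy target."""
    return -0.5 * sum(v * v for v in x)


def rwm_acceptance_rate(dim, ell, n_iter, rng):
    """Run random-walk Metropolis with step ell / sqrt(dim); return acceptance rate."""
    step = ell / math.sqrt(dim)
    # Start at stationarity so that no burn-in is needed in this toy example.
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    lp = log_target(x)
    accepted = 0
    for _ in range(n_iter):
        prop = [v + step * rng.gauss(0.0, 1.0) for v in x]
        lp_prop = log_target(prop)
        # Metropolis--Hastings acceptance probability for a symmetric proposal.
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
            accepted += 1
    return accepted / n_iter


rate = rwm_acceptance_rate(dim=50, ell=2.38, n_iter=2000, rng=random.Random(1))
print("empirical acceptance rate:", rate)
```

With the step size held at `ell / sqrt(D)`, the acceptance rate stays away from 0 and 1 as `dim` grows, which is the kind of non-trivial limit the abstract refers to; with a step size that does not shrink with `dim`, almost every proposal would be rejected.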
On singular values of data matrices with general independent columns
Pub Date : 2021-08-15 DOI: 10.1214/23-aos2263
T. Mei, Chen Wang, Jianfeng Yao
In this paper, we analyse singular values of a large $p\times n$ data matrix $\mathbf{X}_n= (\mathbf{x}_{n1},\ldots,\mathbf{x}_{nn})$ where the columns $\mathbf{x}_{nj}$ are independent $p$-dimensional vectors, possibly with different distributions. Such data matrices are common in high-dimensional statistics. Under a key assumption that the covariance matrices $\mathbf{\Sigma}_{nj}=\text{Cov}(\mathbf{x}_{nj})$ can be asymptotically simultaneously diagonalizable, and appropriate convergence of their spectra, we establish a limiting distribution for the singular values of $\mathbf{X}_n$ when both dimension $p$ and $n$ grow to infinity in a comparable magnitude. The matrix model goes beyond and includes many existing works on different types of sample covariance matrices, including the weighted sample covariance matrix, the Gram matrix model and the sample covariance matrix of linear time series models. Furthermore, we develop two applications of our general approach. First, we obtain the existence and uniqueness of a new limiting spectral distribution of realized covariance matrices for a multi-dimensional diffusion process with anisotropic time-varying co-volatility processes. Secondly, we derive the limiting spectral distribution for singular values of the data matrix for a recent matrix-valued auto-regressive model. Finally, for a generalized finite mixture model, the limiting spectral distribution for singular values of the data matrix is obtained.
Citations: 2
Dispersal density estimation across scales
Pub Date : 2021-08-11 DOI: 10.1214/23-aos2290
M. Hoffmann, Mathias Trabs
We consider a space structured population model generated by two point clouds: a homogeneous Poisson process $M$ with intensity $n\to\infty$ as a model for a parent generation together with a Cox point process $N$ as offspring generation, with conditional intensity given by the convolution of $M$ with a scaled dispersal density $\sigma^{-1}f(\cdot/\sigma)$. Based on a realisation of $M$ and $N$, we study the nonparametric estimation of $f$ and the estimation of the physical scale parameter $\sigma>0$ simultaneously for all regimes $\sigma=\sigma_n$. We establish that the optimal rates of convergence do not depend monotonically on the scale and we construct minimax estimators accordingly whether $\sigma$ is known or considered as a nuisance, in which case we can estimate it and achieve asymptotic minimaxity by plug-in. The statistical reconstruction exhibits a competition between a direct and a deconvolution problem. Our study reveals in particular the existence of a least favourable intermediate inference scale, a phenomenon that seems to be new.
Citations: 0
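The parent and offspring model in the Hoffmann and Trabs abstract above can be simulated in one dimension as a rough sketch. Several choices here are assumptions for illustration only: the observation window $[0, 1]$, a standard Gaussian dispersal density $f$, the function names, and the parameter values. The sketch uses the fact that a Cox process whose conditional intensity is the convolution of the parent points with a probability density can be generated by giving each parent an independent Poisson(1) number of offspring, each displaced by $\sigma$ times a draw from $f$. Boundary effects are ignored, and no estimation is attempted.

```python
import math
import random


def poisson_process(rate, rng):
    """Points of a homogeneous Poisson process on [0, 1], via exponential gaps."""
    points = []
    t = rng.expovariate(rate)
    while t <= 1.0:
        points.append(t)
        t += rng.expovariate(rate)
    return points


def poisson_count(mean, rng):
    """Poisson draw by Knuth's multiplication method (suitable for small means)."""
    limit = math.exp(-mean)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_offspring(parents, sigma, rng):
    """Each parent gets Poisson(1) offspring displaced by sigma * N(0, 1) noise."""
    offspring = []
    for m in parents:
        for _ in range(poisson_count(1.0, rng)):
            offspring.append(m + sigma * rng.gauss(0.0, 1.0))
    return offspring


rng = random.Random(2)
parents = poisson_process(rate=200.0, rng=rng)                  # process M, intensity n = 200
offspring = simulate_offspring(parents, sigma=0.05, rng=rng)    # process N
print(len(parents), "parents and", len(offspring), "offspring")
```

Varying `sigma` relative to the typical parent spacing `1 / rate` moves the simulation between the regimes the abstract contrasts: for small `sigma` each offspring sits close to an identifiable parent, while for large `sigma` the clusters overlap.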
Peskun–Tierney ordering for Markovian Monte Carlo: Beyond the reversible scenario
Pub Date : 2021-08-01 DOI: 10.1214/20-aos2008
C. Andrieu, Samuel Livingstone
Historically time-reversibility of the transitions or processes underpinning Markov chain Monte Carlo methods (MCMC) has played a key role in their development, while the self-adjointness of associated operators together with the use of classical functional analysis techniques on Hilbert spaces have led to powerful and practically successful tools to characterise and compare their performance. Similar results for algorithms relying on nonreversible Markov processes are scarce. We show that for a type of nonreversible Monte Carlo Markov chains and processes, of current or renewed interest in the physics and statistical literatures, it is possible to develop comparison results which closely mirror those available in the reversible scenario. We show that these results shed light on earlier literature, proving some conjectures and strengthening some earlier results.
Citations: 11