This paper establishes asymptotic theory for optimal estimation of change points in general time series models under α-mixing conditions. We show that the Bayes-type estimator is asymptotically minimax for change-point estimation under squared error loss. Two bootstrap procedures are developed to construct confidence intervals for the change points. An approximate limiting distribution of the change-point estimator under small change is also derived. Simulations and real data applications are presented to investigate the finite sample performance of the Bayes-type estimator and the bootstrap procedures.
{"title":"Optimal change-point estimation in time series","authors":"N. Chan, Wai Leong Ng, C. Yau, Haihan Yu","doi":"10.1214/20-aos2039","DOIUrl":"https://doi.org/10.1214/20-aos2039","url":null,"abstract":"This paper establishes asymptotic theory for optimal estimation of change points in general time series models under α-mixing conditions. We show that the Bayes-type estimator is asymptotically minimax for change-point estimation under squared error loss. Two bootstrap procedures are developed to construct confidence intervals for the change points. An approximate limiting distribution of the change-point estimator under small change is also derived. Simulations and real data applications are presented to investigate the finite sample performance of the Bayes-type estimator and the bootstrap procedures.","PeriodicalId":22375,"journal":{"name":"The Annals of Statistics","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89189758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
When statistical decision theory was emerging as a promising new paradigm, Charles Stein played a major role in the development of minimax theory for invariant statistical problems. In some of his earliest work with Gil Hunt, he set out to prove that, in problems where invariant procedures have constant risk, any best invariant test would be minimax among all tests. Although he found this not quite true in general, the effort led to the legendary Hunt–Stein theorem, which established the result under restrictive conditions on the underlying group of transformations. In decision problems invariant under such suitable groups, an overall minimax test was guaranteed to reside within the class of invariant procedures, where it would typically be much easier to find. When it did not seem possible to establish this result for invariance under the full linear group, he instead turned to proving its impossibility through counterexamples, such as the nonminimaxity of the usual sample covariance estimator, a setting in which the full linear group is simply too big for the Hunt–Stein theorem to apply. Later explorations of invariance, such as the sometimes problematic inference under a fiducial distribution, or the characterization of a best invariant procedure as a formal Bayes procedure under a right Haar prior, are further examples of the far-reaching influence of Stein’s contributions to invariance theory.
{"title":"Charles Stein and invariance: Beginning with the Hunt–Stein theorem","authors":"M. L. Eaton, E. George","doi":"10.1214/21-aos2075","DOIUrl":"https://doi.org/10.1214/21-aos2075","url":null,"abstract":"When statistical decision theory was emerging as a promising new paradigm, Charles Stein was to play a major role in the development of minimax theory for invariant statistical problems. In some of his earliest work with Gil Hunt, he set out to prove that, in problems where invariant procedures have constant risk, any best invariant test would be minimax among all tests. Although finding it not quite true in general, this led to the legendary Hunt–Stein theorem, which established the result under restrictive conditions on the underlying group of transformations. In decision problems invariant under such suitable groups, an overall minimax test was guaranteed to reside within the class of invariant procedures where it would typically be much easier to find. But when it did not seem possible to establish this result for invariance under the full linear group, he instead turned to prove its impossibility with counterexamples such as the nonminimaxity of the usual sample covariance estimator where the full linear group was just too big for the Hunt–Stein theorem to apply. Further explorations of invariance such as the sometimes problematic inference under a fiducial distribution, or the characterization of a best invariant procedure as a formal Bayes procedure under a right Haar prior, are further examples of the far reaching influence of Stein’s contributions to invariance theory.","PeriodicalId":22375,"journal":{"name":"The Annals of Statistics","volume":"25 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86639897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction note: “Statistical inference for the mean outcome under a possibly nonunique optimal treatment rule”","authors":"Alexander Luedtke, Aurélien F. Bibaut, M. J. Laan","doi":"10.1214/20-aos2031","DOIUrl":"https://doi.org/10.1214/20-aos2031","url":null,"abstract":"","PeriodicalId":22375,"journal":{"name":"The Annals of Statistics","volume":"25 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89560649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Charles Stein made fundamental contributions to admissibility and inadmissibility in estimation and testing. This paper surveys some of the more important ones. Particular attention will be paid to his monumentally important and, at the time, incredibly surprising discovery of the inadmissibility of the usual estimator of the mean in three and higher dimensions. His result on admissibility of Pitman’s estimator of a mean in one and two dimensions, and his results on estimation of a mean matrix and a covariance matrix, are also discussed. His work on testing is briefly covered.
{"title":"On Charles Stein’s contributions to (in)admissibility","authors":"W. Strawderman","doi":"10.1214/21-aos2108","DOIUrl":"https://doi.org/10.1214/21-aos2108","url":null,"abstract":"Charles Stein made fundamental contributions to admissibility and inadmissibility in estimation and testing. This paper surveys some of the more important ones. Particular attention will be paid to his monumentally important, and at the time, incredibly surprising discovery of the inadmissibility of the usual estimator of the mean in three and higher dimensions. His result on admissibility of Pitman’s estimator of a mean in one and two dimensions, and his results on estimation of a mean matrix and a covariance matrix are also discussed. His work on testing is briefly covered.","PeriodicalId":22375,"journal":{"name":"The Annals of Statistics","volume":"105 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79234017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial: Memorial issue for Charles Stein","authors":"R. Samworth, Ming Yuan","doi":"10.1214/21-aos2110","DOIUrl":"https://doi.org/10.1214/21-aos2110","url":null,"abstract":"","PeriodicalId":22375,"journal":{"name":"The Annals of Statistics","volume":"156 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87096281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a method for the detection of a change point in a sequence $\{F_i\}$ of distributions, which are available through a large number of observations at each $i \geq 1$. Under the null hypothesis, the distributions $F_i$ are all equal. Under the alternative hypothesis, there is a change point $i^* > 1$ such that $F_i = G$ for $i \geq i^*$ and some unknown distribution $G$ that is not equal to $F_1$. The change point, if it exists, is unknown, and the distributions before and after the potential change point are unknown. The decision about the existence of a change point is made sequentially, as new data arrive. At each time $i$, the number of observations $N$ can increase to infinity. The detection procedure is based on a weighted version of the Wasserstein distance. Its asymptotic and finite sample validity is established. Its performance is illustrated by an application to returns on stocks in the S&P 500 index.
{"title":"Monitoring for a change point in a sequence of distributions","authors":"Lajos Horváth, P. Kokoszka, Shixuan Wang","doi":"10.1214/20-aos2036","DOIUrl":"https://doi.org/10.1214/20-aos2036","url":null,"abstract":"We propose a method for the detection of a change point in a sequence ${F_i}$ of distributions, which are available through a large number of observations at each $i geq 1$. Under the null hypothesis, the distributions $F_i$ are equal. Under the alternative hypothesis, there is a change point $i^* > 1$, such that $F_i = G$ for $i geq i^*$ and some unknown distribution $G$, which is not equal to $F_1$. The change point, if it exists, is unknown, and the distributions before and after the potential change point are unknown. The decision about the existence of a change point is made sequentially, as new data arrive. At each time $i$, the count of observations, $N$, can increase to infinity. The detection procedure is based on a weighted version of the Wasserstein distance. Its asymptotic and finite sample validity is established. Its performance is illustrated by an application to returns on stocks in the S&P 500 index.","PeriodicalId":22375,"journal":{"name":"The Annals of Statistics","volume":"21 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83056712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Under some regularity conditions (Stein refers to [33]), the maximum likelihood estimator $\hat{\theta}_n$ based on an i.i.d. sample $X_1, \ldots, X_n$ from $p_\theta$ satisfies that $\sqrt{n}(\hat{\theta}_n - \theta)$ tends to a normal distribution with mean zero and variance $\nabla\varphi(\theta)^T I_\theta^{-1} \nabla\varphi(\theta)$, and hence attains this bound. Even if the parameter set may be multi-dimensional, this lower bound for estimation of a real-valued parameter $\varphi(\theta)$ can already be obtained from considering a one-dimensional
{"title":"Stein 1956: Efficient nonparametric testing and estimation","authors":"A. Vaart, J. Wellner","doi":"10.1214/21-aos2056","DOIUrl":"https://doi.org/10.1214/21-aos2056","url":null,"abstract":"Under some regularity conditions (Stein refers to [33]), the maximum likelihood estimator θ̂n based on an i.i.d. sample X1, . . . ,Xn from pθ satisfies that √ n(θ̂n − θ) tends to a normal distribution with mean zero and variance ∇φ(θ)T I−1 θ ∇φ(θ), and hence attains this bound. Even if the parameter set may be multi-dimensional, this lower bound for estimation of a real-valued parameter φ(θ) can already be obtained from considering a one-dimensional","PeriodicalId":22375,"journal":{"name":"The Annals of Statistics","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87891168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Let $\{(X_i, Y_i)\}_{i=1}^{n}$ be a sequence of independent bivariate random vectors. In this paper, we establish a refined Cramér-type moderate deviation theorem for the general self-normalized sum $\sum_{i=1}^{n} X_i / (\sum_{i=1}^{n} Y_i^2)^{1/2}$, which unifies and extends the classical Cramér (1938) theorem and the self-normalized Cramér-type moderate deviation theorems by Jing, Shao and Wang (2003), as well as the further refined version by Wang (2011). The advantage of our result is evidenced through successful applications to weakly dependent random variables and the self-normalized winsorized mean. Specifically, by applying our new framework to the general self-normalized sum, we significantly improve Cramér-type moderate deviation theorems for one-dependent random variables, geometrically β-mixing random variables, and causal processes under geometric moment contraction. As an additional application, we also derive Cramér-type moderate deviation theorems for the self-normalized winsorized mean.
{"title":"Refined Cramér-type moderate deviation theorems for general self-normalized sums with applications to dependent random variables and winsorized mean","authors":"Lan Gao, Q. Shao, Jiasheng Shi","doi":"10.1214/21-aos2122","DOIUrl":"https://doi.org/10.1214/21-aos2122","url":null,"abstract":"Let {(Xi, Yi)}i=1 be a sequence of independent bivariate random vectors. In this paper, we establish a refined Cramér type moderate deviation theorem for the general self-normalized sum ∑n i=1Xi/( ∑n i=1 Y 2 i ) 1/2, which unifies and extends the classical Cramér (1938) theorem and the selfnormalized Cramér type moderate deviation theorems by Jing, Shao and Wang (2003) as well as the further refined version by Wang (2011). The advantage of our result is evidenced through successful applications to weakly dependent random variables and self-normalized winsorized mean. Specifically, by applying our new framework on general self-normalized sum, we significantly improve Cramér type moderate deviation theorems for onedependent random variables, geometrically β-mixing random variables and causal processes under geometrical moment contraction. As an additional application, we also derive the Cramér type moderate deviation theorems for self-normalized winsorized mean.","PeriodicalId":22375,"journal":{"name":"The Annals of Statistics","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82028780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High-dimensional time series data appear in many scientific areas in the current data-rich environment. Analysis of such data poses new challenges to data analysts because of not only the complicated dynamic dependence between the series, but also the existence of aberrant observations, such as missing values, contaminated observations, and heavy-tailed distributions. For high-dimensional vector autoregressive (VAR) models, we introduce a unified estimation procedure that is robust to model misspecification, heavy-tailed noise contamination, and conditional heteroscedasticity. The proposed methodology enjoys both statistical optimality and computational efficiency, and can handle many popular high-dimensional models, such as sparse, reduced-rank, banded, and network-structured VAR models. With proper regularization and data truncation, the estimation convergence rates are shown to be almost optimal in the minimax sense under a bounded $(2+2\epsilon)$-th moment condition. When $\epsilon \geq 1$, the rates of convergence match those obtained under the sub-Gaussian assumption. Consistency of the proposed estimators is also established for some $\epsilon \in (0,1)$, with minimax optimal convergence rates associated with $\epsilon$. The efficacy of the proposed estimation methods is demonstrated by simulation and a U.S. macroeconomic example.
{"title":"Rate-optimal robust estimation of high-dimensional vector autoregressive models","authors":"Di Wang, R. Tsay","doi":"10.1214/23-aos2278","DOIUrl":"https://doi.org/10.1214/23-aos2278","url":null,"abstract":"High-dimensional time series data appear in many scientific areas in the current data-rich environment. Analysis of such data poses new challenges to data analysts because of not only the complicated dynamic dependence between the series, but also the existence of aberrant observations, such as missing values, contaminated observations, and heavy-tailed distributions. For high-dimensional vector autoregressive (VAR) models, we introduce a unified estimation procedure that is robust to model misspecification, heavy-tailed noise contamination, and conditional heteroscedasticity. The proposed methodology enjoys both statistical optimality and computational efficiency, and can handle many popular high-dimensional models, such as sparse, reduced-rank, banded, and network-structured VAR models. With proper regularization and data truncation, the estimation convergence rates are shown to be almost optimal in the minimax sense under a bounded $(2+2epsilon)$-th moment condition. When $epsilongeq1$, the rates of convergence match those obtained under the sub-Gaussian assumption. Consistency of the proposed estimators is also established for some $epsilonin(0,1)$, with minimax optimal convergence rates associated with $epsilon$. The efficacy of the proposed estimation methods is demonstrated by simulation and a U.S. macroeconomic example.","PeriodicalId":22375,"journal":{"name":"The Annals of Statistics","volume":"70 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83769165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider the problem of positive-semidefinite continuation: extending a partially specified covariance kernel from a subdomain Ω of a rectangular domain I × I to a covariance kernel on the entire domain I × I. For a broad class of domains Ω called serrated domains, we are able to present a complete theory. Namely, we demonstrate that a canonical completion always exists and can be explicitly constructed. We characterise all possible completions as suitable perturbations of the canonical completion, and determine necessary and sufficient conditions for a unique completion to exist. We interpret the canonical completion via the graphical model structure it induces on the associated Gaussian process. Furthermore, we show how the estimation of the canonical completion reduces to the solution of a system of linear statistical inverse problems in the space of Hilbert–Schmidt operators, and derive rates of convergence. We conclude by providing extensions of our theory to more general forms of domains, and by demonstrating how our results can be used to construct covariance estimators from sample path fragments of the associated stochastic process. Our results are illustrated numerically by way of a simulation study and a real example.
{"title":"The completion of covariance kernels","authors":"Kartik G. Waghmare, V. Panaretos","doi":"10.1214/22-aos2228","DOIUrl":"https://doi.org/10.1214/22-aos2228","url":null,"abstract":": We consider the problem of positive-semidefinite continuation: extending a partially spec- ified covariance kernel from a subdomain Ω of a rectangular domain I × I to a covariance kernel on the entire domain I × I . For a broad class of domains Ω called serrated domains , we are able to present a complete theory. Namely, we demonstrate that a canonical completion always exists and can be explicitly constructed. We characterise all possible completions as suitable perturbations of the canonical completion, and determine necessary and sufficient conditions for a unique completion to exist. We interpret the canonical completion via the graphical model structure it induces on the associated Gaussian process. Furthermore, we show how the estimation of the canonical completion reduces to the solution of a system of linear statistical inverse problems in the space of Hilbert-Schmidt operators, and derive rates of convergence. We conclude by providing extensions of our theory to more general forms of domains, and by demonstrating how our results can be used to construct covariance estimators from sample path fragments of the associated stochastic process. Our results are illustrated numerically by way of a simulation study and a real example.","PeriodicalId":22375,"journal":{"name":"The Annals of Statistics","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82635870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}