Conditional value-at-risk is a popular risk measure in risk management. We study the inference problem of conditional value-at-risk under a linear predictive regression model. We derive the asymptotic distribution of the least squares estimator for the conditional value-at-risk. Our results relax the model assumptions made in Chun et al. (2012) and correct their mistake in the asymptotic variance expression. We show that the asymptotic variance depends on the quantile density function of the unobserved error and whether the model has a predictor with infinite variance, which makes it challenging to actually quantify the uncertainty of the conditional risk measure. To make the inference feasible, we then propose a smooth empirical likelihood based method for constructing a confidence interval for the conditional value-at-risk based on either independent errors or GARCH errors. Our approach not only bypasses the challenge of directly estimating the asymptotic variance but also does not need to know whether there exists an infinite variance predictor in the predictive model. Furthermore, we apply the same idea to the quantile regression method, which allows infinite variance predictors and generalizes the parameter estimation in Whang (2006) to conditional value-at-risk in the supplementary material. We demonstrate the finite sample performance of the derived confidence intervals through numerical studies before applying them to real data.
{"title":"Inference for conditional value-at-risk of a predictive regression","authors":"Yi He, Yanxi Hou, L. Peng, Haipeng Shen","doi":"10.1214/19-aos1937","DOIUrl":"https://doi.org/10.1214/19-aos1937","url":null,"abstract":"Conditional value-at-risk is a popular risk measure in risk management. We study the inference problem of conditional value-at-risk under a linear predictive regression model. We derive the asymptotic distribution of the least squares estimator for the conditional value-at-risk. Our results relax the model assumptions made in Chun et al. (2012) and correct their mistake in the asymptotic variance expression. We show that the asymptotic variance depends on the quantile density function of the unobserved error and whether the model has a predictor with infinite variance, which makes it challenging to actually quantify the uncertainty of the conditional risk measure. To make the inference feasible, we then propose a smooth empirical likelihood based method for constructing a confidence interval for the conditional value-at-risk based on either independent errors or GARCH errors. Our approach not only bypasses the challenge of directly estimating the asymptotic variance but also does not need to know whether there exists an infinite variance predictor in the predictive model. Furthermore, we apply the same idea to the quantile regression method, which allows infinite variance predictors and generalizes the parameter estimation in Whang (2006) to conditional value-at-risk in the supplementary material. 
We demonstrate the finite sample performance of the derived confidence intervals through numerical studies before applying them to real data.","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"48 1","pages":"3442-3464"},"PeriodicalIF":4.5,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48812460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
By Fabienne Comte and Valentine Genon-Catalot (Université de Paris, MAP5, CNRS, F-75006, France). We consider N independent stochastic processes (X_i(t), t ∈ [0, T]), i = 1, ..., N, defined by a one-dimensional stochastic differential equation, which are continuously observed throughout a time interval [0, T] where T is fixed. We study nonparametric estimation of the drift function on a given subset A of ℝ. Projection estimators are defined on finite-dimensional subsets of L²(A, dx). We stress that the set A may be compact or not and the diffusion coefficient may be bounded or not. A data-driven procedure to select the dimension of the projection space is proposed where the dimension is chosen within a random collection of models. Upper bounds of risks are obtained, the assumptions are discussed and simulation experiments are reported.
{"title":"Nonparametric drift estimation for i.i.d. paths of stochastic differential equations","authors":"F. Comte, V. Genon-Catalot","doi":"10.1214/19-aos1933","DOIUrl":"https://doi.org/10.1214/19-aos1933","url":null,"abstract":"By Fabienne Comte and Valentine Genon-Catalot (Université de Paris, MAP5, CNRS, F-75006, France). We consider N independent stochastic processes (X_i(t), t ∈ [0, T]), i = 1, ..., N, defined by a one-dimensional stochastic differential equation, which are continuously observed throughout a time interval [0, T] where T is fixed. We study nonparametric estimation of the drift function on a given subset A of ℝ. Projection estimators are defined on finite-dimensional subsets of L²(A, dx). We stress that the set A may be compact or not and the diffusion coefficient may be bounded or not. A data-driven procedure to select the dimension of the projection space is proposed where the dimension is chosen within a random collection of models. Upper bounds of risks are obtained, the assumptions are discussed and simulation experiments are reported.","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"48 1","pages":"3336-3365"},"PeriodicalIF":4.5,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48826148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hamiltonian Monte Carlo (HMC) is currently one of the most popular Markov Chain Monte Carlo algorithms to sample smooth distributions over continuous state space. This paper discusses the irreducibility and geometric ergodicity of the HMC algorithm. We consider cases where the number of steps of the Störmer-Verlet integrator is either fixed or random. Under mild conditions on the potential U associated with target distribution π, we first show that the Markov kernel associated to the HMC algorithm is irreducible and positive recurrent. Under more stringent conditions, we then establish that the Markov kernel is Harris recurrent. We provide verifiable conditions on U under which the HMC sampler is geometrically ergodic. Finally, we illustrate our results on several examples.
{"title":"Irreducibility and geometric ergodicity of Hamiltonian Monte Carlo","authors":"Alain Durmus, É. Moulines, E. Saksman","doi":"10.1214/19-aos1941","DOIUrl":"https://doi.org/10.1214/19-aos1941","url":null,"abstract":"Hamiltonian Monte Carlo (HMC) is currently one of the most popular Markov Chain Monte Carlo algorithms to sample smooth distributions over continuous state space. This paper discusses the irreducibility and geometric ergodicity of the HMC algorithm. We consider cases where the number of steps of the Störmer-Verlet integrator is either fixed or random. Under mild conditions on the potential U associated with target distribution π, we first show that the Markov kernel associated to the HMC algorithm is irreducible and positive recurrent. Under more stringent conditions, we then establish that the Markov kernel is Harris recurrent. We provide verifiable conditions on U under which the HMC sampler is geometrically ergodic. Finally, we illustrate our results on several examples.","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"48 1","pages":"3545-3564"},"PeriodicalIF":4.5,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44346149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
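The Durmus, Moulines and Saksman abstract above refers to an HMC kernel built on the Störmer-Verlet (leapfrog) integrator with a fixed number of steps. The following is a minimal one-dimensional sketch of that construction, not the authors' code: the function names `leapfrog` and `hmc_step`, the unit mass, and the standard Gaussian momentum are all assumptions made for illustration.

```python
import math
import random


def leapfrog(q, p, grad_U, step_size, n_steps):
    """Stormer-Verlet (leapfrog) integration of Hamiltonian dynamics in 1-D.

    Hypothetical helper: assumes unit mass, so H(q, p) = U(q) + p**2 / 2.
    """
    # Half step for the momentum.
    p = p - 0.5 * step_size * grad_U(q)
    for i in range(n_steps):
        # Full step for the position.
        q = q + step_size * p
        # Full step for the momentum, except after the last position update.
        if i < n_steps - 1:
            p = p - step_size * grad_U(q)
    # Final half step for the momentum.
    p = p - 0.5 * step_size * grad_U(q)
    return q, p


def hmc_step(q, U, grad_U, step_size, n_steps, rng):
    """One HMC transition with a fixed number of leapfrog steps."""
    p0 = rng.gauss(0.0, 1.0)  # refresh the momentum
    q_new, p_new = leapfrog(q, p0, grad_U, step_size, n_steps)
    h_old = U(q) + 0.5 * p0 * p0
    h_new = U(q_new) + 0.5 * p_new * p_new
    # Metropolis-Hastings correction for the discretisation error.
    accept_prob = math.exp(min(0.0, h_old - h_new))
    return q_new if rng.random() < accept_prob else q


if __name__ == "__main__":
    # Illustrative target: standard normal, U(q) = q**2 / 2.
    rng = random.Random(0)
    q = 0.0
    draws = []
    for _ in range(2000):
        q = hmc_step(q, lambda x: 0.5 * x * x, lambda x: x, 0.2, 10, rng)
        draws.append(q)
    print(sum(draws) / len(draws))
```

The random-step-count case mentioned in the abstract could be sketched by drawing `n_steps` afresh inside `hmc_step`; that variant is not shown here.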
{"title":"Fréchet change-point detection","authors":"Paromita Dubey, H. Müller","doi":"10.1214/19-AOS1930","DOIUrl":"https://doi.org/10.1214/19-AOS1930","url":null,"abstract":"","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"48 1","pages":"3312-3335"},"PeriodicalIF":4.5,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44332233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An elaborate theory of predictions of a causal hypothesis consists of several falsifiable statements derived from the causal hypothesis. Statistical tests for the various pieces of the elaborate theory help to clarify how much the causal hypothesis is corroborated. In practice, the degree of corroboration of the causal hypothesis has been assessed by a verbal description of which of the several tests provides evidence for which of the several predictions. This verbal approach can miss quantitative patterns. In this paper, we develop a quantitative approach. We first decompose these various tests of the predictions into independent factors with different sources of potential biases. Support for the causal hypothesis is enhanced when many of these evidence factors support the predictions. A sensitivity analysis is used to assess the potential bias that could make the finding of the tests spurious. Along with this multi-parameter sensitivity analysis, we consider the partial conjunctions of the tests. These partial conjunctions quantify the evidence supporting various fractions of the collection of predictions. A partial conjunction test involves combining tests of the components in the partial conjunction. We find the asymptotically optimal combination of tests in the context of a sensitivity analysis. Our analysis of an elaborate theory of a causal hypothesis controls for the familywise error rate.
{"title":"Assessment of the extent of corroboration of an elaborate theory of a causal hypothesis using partial conjunctions of evidence factors","authors":"B. Karmakar, Dylan S. Small","doi":"10.1214/19-aos1929","DOIUrl":"https://doi.org/10.1214/19-aos1929","url":null,"abstract":"An elaborate theory of predictions of a causal hypothesis consists of several falsifiable statements derived from the causal hypothesis. Statistical tests for the various pieces of the elaborate theory help to clarify how much the causal hypothesis is corroborated. In practice, the degree of corroboration of the causal hypothesis has been assessed by a verbal description of which of the several tests provides evidence for which of the several predictions. This verbal approach can miss quantitative patterns. In this paper, we develop a quantitative approach. We first decompose these various tests of the predictions into independent factors with different sources of potential biases. Support for the causal hypothesis is enhanced when many of these evidence factors support the predictions. A sensitivity analysis is used to assess the potential bias that could make the finding of the tests spurious. Along with this multi-parameter sensitivity analysis, we consider the partial conjunctions of the tests. These partial conjunctions quantify the evidence supporting various fractions of the collection of predictions. A partial conjunction test involves combining tests of the components in the partial conjunction. We find the asymptotically optimal combination of tests in the context of a sensitivity analysis. 
Our analysis of an elaborate theory of a causal hypothesis controls for the familywise error rate.","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"48 1","pages":"3283-3311"},"PeriodicalIF":4.5,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43683533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-10-01. Epub Date: 2020-09-19. DOI: 10.1214/19-aos1900
Ethan X Fang, Yang Ning, Runze Li
This paper concerns statistical inference for longitudinal data with ultrahigh dimensional covariates. We first study the problem of constructing confidence intervals and hypothesis tests for a low dimensional parameter of interest. The major challenge is how to construct a powerful test statistic in the presence of high-dimensional nuisance parameters and sophisticated within-subject correlation of longitudinal data. To deal with the challenge, we propose a new quadratic decorrelated inference function approach, which simultaneously removes the impact of nuisance parameters and incorporates the correlation to enhance the efficiency of the estimation procedure. When the parameter of interest is of fixed dimension, we prove that the proposed estimator is asymptotically normal and attains the semiparametric information bound, based on which we can construct an optimal Wald test statistic. We further extend this result and establish the limiting distribution of the estimator under the setting with the dimension of the parameter of interest growing with the sample size at a polynomial rate. Finally, we study how to control the false discovery rate (FDR) when a vector of high-dimensional regression parameters is of interest. We prove that applying Storey's (2002) procedure to the proposed test statistics for each regression parameter controls FDR asymptotically in longitudinal data. We conduct simulation studies to assess the finite sample performance of the proposed procedures. Our simulation results imply that the newly proposed procedure can control both Type I error for testing a low dimensional parameter of interest and the FDR in the multiple testing problem. We also apply the proposed procedure to a real data example.
{"title":"TEST OF SIGNIFICANCE FOR HIGH-DIMENSIONAL LONGITUDINAL DATA.","authors":"Ethan X Fang, Yang Ning, Runze Li","doi":"10.1214/19-aos1900","DOIUrl":"10.1214/19-aos1900","url":null,"abstract":"This paper concerns statistical inference for longitudinal data with ultrahigh dimensional covariates. We first study the problem of constructing confidence intervals and hypothesis tests for a low dimensional parameter of interest. The major challenge is how to construct a powerful test statistic in the presence of high-dimensional nuisance parameters and sophisticated within-subject correlation of longitudinal data. To deal with the challenge, we propose a new quadratic decorrelated inference function approach, which simultaneously removes the impact of nuisance parameters and incorporates the correlation to enhance the efficiency of the estimation procedure. When the parameter of interest is of fixed dimension, we prove that the proposed estimator is asymptotically normal and attains the semiparametric information bound, based on which we can construct an optimal Wald test statistic. We further extend this result and establish the limiting distribution of the estimator under the setting with the dimension of the parameter of interest growing with the sample size at a polynomial rate. Finally, we study how to control the false discovery rate (FDR) when a vector of high-dimensional regression parameters is of interest. We prove that applying Storey's (2002) procedure to the proposed test statistics for each regression parameter controls FDR asymptotically in longitudinal data. We conduct simulation studies to assess the finite sample performance of the proposed procedures. Our simulation results imply that the newly proposed procedure can control both Type I error for testing a low dimensional parameter of interest and the FDR in the multiple testing problem. 
We also apply the proposed procedure to a real data example.","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"48 5","pages":"2622-2645"},"PeriodicalIF":4.5,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8277154/pdf/nihms-1614211.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39189359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hypothesis testing for high-dimensional time series via self-normalization","authors":"Runmin Wang, X. Shao","doi":"10.1214/19-AOS1904","DOIUrl":"https://doi.org/10.1214/19-AOS1904","url":null,"abstract":"","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"48 1","pages":"2728-2758"},"PeriodicalIF":4.5,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42553602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Arun K. Kuchibhotla, L. Brown, A. Buja, Junhui Cai, E. George, Linda H. Zhao
S.1. Simulations Continued. The simulation setting in this section is the same as in Section 9. We first describe the reason for using the null situation β_0 = 0_p in the model. If β_0 is an arbitrary non-zero vector, then, for fixed covariates, X_i Y_i cannot be identically distributed and hence only (asymptotically) conservative inference is possible. In simulations this conservativeness is confounded with the simultaneity, so that the coverage becomes close to 1 (if not 1). In the main manuscript, we have shown plots comparing our method with Berk et al. (2013) and selective inference. We label our confidence region R̂_{n,M} (12) as “UPoSI,” the projected confidence region B̂_{n,M} (28) as “UPoSIBox,” and Berk et al. (2013) as “PoSI.” Tables 1, 2, and 3 show exact numbers for the comparison of our method with Berk et al. (2013). Note that the size of each dot in the row plot of Figure 9 indicates the proportion of confidence regions of that volume among same-sized models. In Settings A and B, the confidence region volumes of same-sized models are the same. In Setting C, volumes of confidence regions of Berk and PoSI Box enlarge (hence smaller log(Vol)/|M|) if the last covariate is included. Tables 4 and 5 show the numbers for the comparison of our method with selective inference when the selection procedure is forward stepwise and LARS, respectively. Sample splitting is a simple procedure that provides valid inference after selection as discussed in Section 1.3. We stress here that this is valid only for independent observations and that the model selected in the first split half could be different from the one selected in the full data. The comparison results with n = 1000, p = 500 and the selection methods forward stepwise, LARS and BIC are summarized in Figure S.1. For sample splitting we have used the Bonferroni correction to obtain simultaneous inference for all coefficients in a model. Table 6 shows the comparison of our method with sample splitting.
{"title":"Valid post-selection inference in model-free linear regression","authors":"Arun K. Kuchibhotla, L. Brown, A. Buja, Junhui Cai, E. George, Linda H. Zhao","doi":"10.1214/19-AOS1917","DOIUrl":"https://doi.org/10.1214/19-AOS1917","url":null,"abstract":"S.1. Simulations Continued. The simulation setting in this section is the same as in Section 9. We first describe the reason for using the null situation β_0 = 0_p in the model. If β_0 is an arbitrary non-zero vector, then, for fixed covariates, X_i Y_i cannot be identically distributed and hence only (asymptotically) conservative inference is possible. In simulations this conservativeness is confounded with the simultaneity, so that the coverage becomes close to 1 (if not 1). In the main manuscript, we have shown plots comparing our method with Berk et al. (2013) and selective inference. We label our confidence region R̂_{n,M} (12) as “UPoSI,” the projected confidence region B̂_{n,M} (28) as “UPoSIBox,” and Berk et al. (2013) as “PoSI.” Tables 1, 2, and 3 show exact numbers for the comparison of our method with Berk et al. (2013). Note that the size of each dot in the row plot of Figure 9 indicates the proportion of confidence regions of that volume among same-sized models. In Settings A and B, the confidence region volumes of same-sized models are the same. In Setting C, volumes of confidence regions of Berk and PoSI Box enlarge (hence smaller log(Vol)/|M|) if the last covariate is included. Tables 4 and 5 show the numbers for the comparison of our method with selective inference when the selection procedure is forward stepwise and LARS, respectively. Sample splitting is a simple procedure that provides valid inference after selection as discussed in Section 1.3. We stress here that this is valid only for independent observations and that the model selected in the first split half could be different from the one selected in the full data. 
The comparison results with n = 1000, p = 500 and the selection methods forward stepwise, LARS and BIC are summarized in Figure S.1. For sample splitting we have used the Bonferroni correction to obtain simultaneous inference for all coefficients in a model. Table 6 shows the comparison of our method with sample splitting.","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"48 1","pages":"2953-2981"},"PeriodicalIF":4.5,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"66077588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
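The Kuchibhotla et al. supplement text above says that, for sample splitting, a Bonferroni correction gives simultaneous inference for all coefficients in the selected model. A minimal sketch of that correction follows. It assumes normal-approximation intervals computed from estimates and standard errors obtained on the held-out half; the function name `bonferroni_intervals` and its inputs are hypothetical, and the source does not say how its intervals were computed.

```python
from statistics import NormalDist


def bonferroni_intervals(estimates, std_errors, alpha=0.05):
    """Simultaneous normal-approximation intervals via Bonferroni.

    Hypothetical helper: each of the m coefficients gets a two-sided
    interval at level 1 - alpha / m, so the joint coverage is at least
    1 - alpha by the union bound.
    """
    m = len(estimates)
    if m == 0:
        return []
    # Two-sided critical value at per-coefficient level alpha / m.
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * m))
    return [(b - z * s, b + z * s) for b, s in zip(estimates, std_errors)]


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    for lo, hi in bonferroni_intervals([0.8, -0.1, 0.3], [0.2, 0.2, 0.25]):
        print(round(lo, 3), round(hi, 3))
```

The intervals widen as the selected model grows, which is the price paid for simultaneity over all coefficients in that model.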
In this paper we consider the problem of testing the equality of two multivariate distributions based on geometric graphs, constructed using the inter-point distances between the observations. These include the test based on the minimum spanning tree and the K-nearest neighbor (NN) graphs, among others. These tests are asymptotically distribution-free, universally consistent, and computationally efficient, making them particularly useful in modern applications. However, very little is known about the power properties of these tests. In this paper, using theory of stabilizing geometric graphs, we derive the asymptotic distribution of these tests under general alternatives, in the Poissonized setting. Using this, the detection threshold and the limiting local power of the test based on the K-NN graph are obtained, where interesting exponents depending on dimension emerge. This provides a way to compare and justify the performance of these tests in different examples.
{"title":"Asymptotic distribution and detection thresholds for two-sample tests based on geometric graphs","authors":"B. Bhattacharya","doi":"10.1214/19-AOS1913","DOIUrl":"https://doi.org/10.1214/19-AOS1913","url":null,"abstract":"In this paper we consider the problem of testing the equality of two multivariate distributions based on geometric graphs, constructed using the inter-point distances between the observations. These include the test based on the minimum spanning tree and the K-nearest neighbor (NN) graphs, among others. These tests are asymptotically distribution-free, universally consistent, and computationally efficient, making them particularly useful in modern applications. However, very little is known about the power properties of these tests. In this paper, using theory of stabilizing geometric graphs, we derive the asymptotic distribution of these tests under general alternatives, in the Poissonized setting. Using this, the detection threshold and the limiting local power of the test based on the K-NN graph are obtained, where interesting exponents depending on dimension emerge. This provides a way to compare and justify the performance of these tests in different examples.","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"48 1","pages":"2879-2903"},"PeriodicalIF":4.5,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43314523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
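The Bhattacharya abstract above describes two-sample tests built on geometric graphs such as the K-nearest neighbor graph of the pooled observations. A common form of such a statistic counts how often a point's nearest neighbors come from its own sample. The sketch below implements that count under stated assumptions: squared Euclidean distance, ties broken by pooled index, and a hypothetical function name `knn_same_sample_fraction`. The exact statistic and normalisation used in the paper may differ.

```python
def knn_same_sample_fraction(xs, ys, k):
    """Fraction of K-NN edges in the pooled sample joining same-sample points.

    xs, ys: lists of equal-length coordinate tuples (the two samples).
    Values near the chance level suggest equal distributions; values
    near 1 suggest the two samples are separated.
    """
    # Pool the samples, remembering each point's sample label.
    pooled = [(pt, 0) for pt in xs] + [(pt, 1) for pt in ys]
    n = len(pooled)
    same = 0
    for i, (pi, label_i) in enumerate(pooled):
        # Squared Euclidean distances to every other pooled point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(pi, pj)), j)
            for j, (pj, _) in enumerate(pooled)
            if j != i
        )
        # Count the k nearest neighbors that share point i's label.
        same += sum(1 for _, j in dists[:k] if pooled[j][1] == label_i)
    return same / (n * k)


if __name__ == "__main__":
    # Two well-separated clusters: every neighbor is from the same sample.
    xs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
    ys = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(knn_same_sample_fraction(xs, ys, k=2))
```

This brute-force version costs O(n² log n) time; it is meant to show the statistic, not to be an efficient implementation.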
{"title":"Asymptotic risk and phase transition of $l_{1}$-penalized robust estimator","authors":"Hanwen Huang","doi":"10.1214/19-AOS1923","DOIUrl":"https://doi.org/10.1214/19-AOS1923","url":null,"abstract":"","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"48 1","pages":"3090-3111"},"PeriodicalIF":4.5,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48301749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}