Pub Date: 2021-04-01; Epub Date: 2021-04-02; DOI: 10.1214/20-aos1973
Fredrik Sävje, Peter Aronow, Michael Hudgens
We investigate large-sample properties of treatment effect estimators under unknown interference in randomized experiments. The inferential target is a generalization of the average treatment effect estimand that marginalizes over potential spillover effects. We show that estimators commonly used to estimate treatment effects under no interference are consistent for the generalized estimand for several common experimental designs under limited but otherwise arbitrary and unknown interference. The rates of convergence depend on the rate at which the amount of interference grows and the degree to which it aligns with dependencies in treatment assignment. Importantly for practitioners, the results imply that if one erroneously assumes that units do not interfere in a setting with limited, or even moderate, interference, standard estimators are nevertheless likely to be close to an average treatment effect if the sample is sufficiently large. Conventional confidence statements may, however, not be accurate.
Average treatment effects in the presence of unknown interference. Fredrik Sävje, Peter Aronow, Michael Hudgens. Annals of Statistics 49(2): 673-701. Published 2021-04-01. DOI: 10.1214/20-aos1973. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8372033/pdf/nihms-1683738.pdf
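The consistency claim in the abstract above can be illustrated with a small simulation. The sketch below is only a toy under stated assumptions: the ring-shaped interference pattern, the outcome model, and the names `simulate` and `difference_in_means` are hypothetical and do not come from the paper. Each unit's outcome depends on its own treatment and on the treatment of one neighbour, and the analyst ignores the spillover and uses the plain difference-in-means estimator under a Bernoulli(1/2) design.

```python
import random


def difference_in_means(outcomes, treatments):
    """Mean outcome among treated units minus mean outcome among controls."""
    treated = [y for y, z in zip(outcomes, treatments) if z == 1]
    control = [y for y, z in zip(outcomes, treatments) if z == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def simulate(n, direct_effect=2.0, spillover=0.5, seed=0):
    """Bernoulli(1/2) assignment on a ring (hypothetical outcome model).

    Unit i receives a spillover from unit i+1, so interference is limited
    to one neighbour. Marginalizing over the neighbour's assignment, the
    average effect of a unit's own treatment equals `direct_effect`.
    """
    rng = random.Random(seed)
    z = [rng.randint(0, 1) for _ in range(n)]
    y = [
        1.0 + direct_effect * z[i] + spillover * z[(i + 1) % n] + rng.gauss(0.0, 1.0)
        for i in range(n)
    ]
    return y, z


if __name__ == "__main__":
    outcomes, treatments = simulate(10_000)
    print("difference in means:", difference_in_means(outcomes, treatments))
```

In this toy model the estimator that ignores interference still lands near the direct effect for large samples, which is in the spirit of the abstract's message; it says nothing about the accuracy of conventional confidence intervals.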
This paper investigates conditions for variable selection consistency of the LASSO in high-dimensional regression models and gives necessary and sufficient conditions for it, potentially allowing the model dimension p to grow arbitrarily fast as a function of the sample size n. These conditions require both upper and lower bounds on the growth rate of the penalty parameter. It turns out that a variant of the irrepresentable condition (IRC) of Zhao and Yu (2006), herein called the lower irrepresentable condition (LIRC), is determined by the lower-bound considerations, while the upper-bound considerations lead to a new condition, called the upper irrepresentable condition (UIRC) in this paper. It is shown that the LIRC together with the UIRC is necessary and sufficient for the variable selection consistency of the LASSO, thereby settling a conjecture of Zhao and Yu (2006). Further, it is shown that under some mild regularity conditions, the penalty parameter must necessarily tend to infinity at a certain minimal rate to ensure variable selection consistency of the LASSO and that the corresponding LASSO estimators of the nonzero regression parameters cannot be √n-consistent (even for individual parameters). Thus, under fairly general conditions, the LASSO with a single choice of the penalty parameter cannot achieve both variable selection consistency and √n-consistency simultaneously. MSC 2010 subject classifications: Primary 62E20; secondary 62J05.
Necessary and sufficient conditions for variable selection consistency of the LASSO in high dimensions. S. Lahiri. Annals of Statistics. Published 2021-04-01. DOI: 10.1214/20-AOS1979
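The tension between selection and estimation accuracy described in the abstract above can be seen in the simplest textbook case. The sketch below assumes an orthonormal design (an assumption made here for illustration, not a setting taken from the paper), where the LASSO solution of (1/2)||y - Xb||^2 + penalty * ||b||_1 is obtained by soft-thresholding the least-squares coefficients. The function names are hypothetical.

```python
def soft_threshold(value, penalty):
    """Soft-thresholding: sign(value) * max(|value| - penalty, 0)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_orthonormal(ols_coefficients, penalty):
    """LASSO solution when the design satisfies X^T X = I (assumed here)."""
    return [soft_threshold(b, penalty) for b in ols_coefficients]


if __name__ == "__main__":
    # Two strong coefficients and two that are pure noise (made-up numbers).
    ols = [3.0, -2.0, 0.4, -0.1]
    print(lasso_orthonormal(ols, 0.05))  # small penalty: noise terms survive
    print(lasso_orthonormal(ols, 0.5))   # larger penalty: noise removed, signal shrunk
```

A penalty large enough to zero out the noise coefficients also shrinks every retained coefficient by the full penalty amount, which gives some intuition for why a single penalty choice struggles to deliver both correct selection and fast estimation rates.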
Pub Date: 2021-02-01; Epub Date: 2021-01-29; DOI: 10.1214/20-aos1963
Yuxin Chen, Chen Cheng, Jianqing Fan
This paper is concerned with the interplay between statistical asymmetry and spectral methods. Suppose we are interested in estimating a rank-1 and symmetric matrix $M^\star \in \mathbb{R}^{n \times n}$, yet only a randomly perturbed version M is observed. The noise matrix M - M⋆ is composed of independent (but not necessarily homoscedastic) entries and is, therefore, not symmetric in general. This might arise if, for example, we have two independent samples for each entry of M⋆ and arrange them in an asymmetric fashion. The aim is to estimate the leading eigenvalue and the leading eigenvector of M⋆. We demonstrate that the leading eigenvalue of the data matrix M can be $O(\sqrt{n})$ times more accurate (up to some log factor) than the (unadjusted) leading singular value of M in eigenvalue estimation. Moreover, the eigen-decomposition approach is fully adaptive to heteroscedasticity of noise, without the need for any prior knowledge about the noise distributions. In a nutshell, this curious phenomenon arises since the statistical asymmetry automatically mitigates the bias of the eigenvalue approach, thus eliminating the need for careful bias correction. Additionally, we develop appealing non-asymptotic eigenvector perturbation bounds; in particular, we are able to bound the perturbation of any linear function of the leading eigenvector of M (e.g. entrywise eigenvector perturbation). We also provide partial theory for the more general rank-r case. The takeaway message is this: arranging the data samples in an asymmetric manner and performing eigen-decomposition could sometimes be quite beneficial.
Asymmetry helps: eigenvalue and eigenvector analyses of asymmetrically perturbed low-rank matrices. Yuxin Chen, Chen Cheng, Jianqing Fan. Annals of Statistics 49(1): 435-458. Published 2021-02-01. DOI: 10.1214/20-aos1963. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8300484/pdf/nihms-1639565.pdf
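The comparison between the leading eigenvalue and the leading singular value in the abstract above can be tried numerically. The following is a rough sketch, not the authors' procedure: it assumes a rank-1 truth with leading eigenvalue 1, independent Gaussian noise of a hand-picked scale, and plain power iteration in place of a full decomposition. All names and parameter values are illustrative.

```python
import math
import random


def matvec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def leading_eigenvalue(matrix, iterations=30):
    """Power iteration; assumes a real, strictly dominant eigenvalue."""
    v = normalize([1.0] * len(matrix))
    for _ in range(iterations):
        v = normalize(matvec(matrix, v))
    # Rayleigh-type quotient v . (M v) with a unit vector v.
    return sum(a * b for a, b in zip(v, matvec(matrix, v)))


def leading_singular_value(matrix, iterations=40):
    """Power iteration on M^T M, then the norm of M v."""
    transpose = [list(column) for column in zip(*matrix)]
    v = normalize([1.0] * len(matrix))
    for _ in range(iterations):
        v = normalize(matvec(transpose, matvec(matrix, v)))
    return math.sqrt(sum(x * x for x in matvec(matrix, v)))


def noisy_rank_one(n, sigma, seed=0):
    """M = u u^T + H with u = (1, ..., 1)/sqrt(n) and independent N(0, sigma^2) noise.

    The noise entries are drawn independently, so H is not symmetric.
    """
    rng = random.Random(seed)
    signal = 1.0 / n  # every entry of u u^T
    return [[signal + rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(n)]


if __name__ == "__main__":
    data = noisy_rank_one(n=225, sigma=1.0 / 30.0)
    print("eigenvalue error:    ", abs(leading_eigenvalue(data) - 1.0))
    print("singular value error:", abs(leading_singular_value(data) - 1.0))
```

The leading singular value can never be smaller than the modulus of the leading eigenvalue, so any upward bias from the noise shows up in the singular value first; the simulation simply makes that gap visible for one noise level.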
Consider a process that produces a series of independent identically distributed vectors. A change in an underlying state may become manifest in a modification of one or more of the marginal distributions. Often, the dependence structure between coordinates is unknown, impeding surveillance based on the joint distribution. A popular approach is to construct control charts for each coordinate separately and raise an alarm the first time any (or some) of the control charts signals. The difficulty is obtaining an expression for the overall average run length to false alarm (ARL2FA). We argue that despite the dependence structure, when the process is in control, for large ARLs to false alarm, run lengths of many types of control charts run in parallel are asymptotically independent. Furthermore, often, in-control run lengths are asymptotically exponentially distributed, enabling uncomplicated asymptotic expressions for the ARL2FA. We prove this assertion for certain Cusum and Shiryaev–Roberts-type control charts and illustrate it by simulations.
A rule of thumb: Run lengths to false alarm of many types of control charts run in parallel on dependent streams are asymptotically independent. M. Pollak. Annals of Statistics 49(1): 557-567. Published 2021-02-01. DOI: 10.1214/20-AOS1968
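If the in-control run lengths of the individual charts are treated as independent and exponentially distributed, as the abstract above suggests is reasonable asymptotically, the time to the first alarm is the minimum of independent exponentials, and its mean is the reciprocal of the summed reciprocal ARLs. The helper below is a minimal sketch of that arithmetic; the function names are hypothetical, and the exact form of the paper's asymptotic expressions may differ.

```python
import math


def combined_arl_to_false_alarm(individual_arls):
    """Mean time to the first false alarm among parallel charts.

    Assumes independent, exponentially distributed in-control run lengths,
    so the minimum is exponential with rate sum(1 / ARL_i).
    """
    return 1.0 / sum(1.0 / arl for arl in individual_arls)


def false_alarm_probability(individual_arls, horizon):
    """P(first false alarm occurs within `horizon`) under the same assumption."""
    return 1.0 - math.exp(-horizon / combined_arl_to_false_alarm(individual_arls))


if __name__ == "__main__":
    arls = [1000.0, 1000.0, 500.0]
    print("combined ARL2FA:", combined_arl_to_false_alarm(arls))
    print("P(alarm within 100 steps):", false_alarm_probability(arls, 100.0))
```

Read in the other direction, the same formula suggests how to calibrate k identical charts for a target overall ARL2FA: each chart would need an individual ARL roughly k times the target.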
While the minimum aberration criterion is popular for selecting good designs with qualitative factors under an ANOVA model, the minimum $\beta$-aberration criterion is more suitable for selecting designs with quantitative factors under a polynomial model. In this paper, we propose the concept of wordlength enumerator to unify these two criteria. The wordlength enumerator is defined as an average similarity of contrasts among all possible pairs of runs. The wordlength enumerator is easy and fast to compute, and can be used to compare and rank designs efficiently. Based on the wordlength enumerator, we develop simple and fast methods for calculating both the generalized wordlength pattern and the $\beta$-wordlength pattern. We further obtain a lower bound of the wordlength enumerator for three-level designs and characterize the combinatorial structure of designs achieving the lower bound. Finally, we propose two methods for constructing supersaturated designs that have both generalized minimum aberration and minimum $\beta$-aberration.
Wordlength enumerator for fractional factorial designs. Yu Tang, Hongquan Xu. Annals of Statistics 49(1): 255-271. Published 2021-02-01. DOI: 10.1214/20-AOS1955
We rectify a wrongly stated fact in the paper of Belitser, Ghosal and van Zanten (Ann. Statist. 40 (2012) 2850–2876).
Correction note: “Optimal two-stage procedures for estimating location and size of the maximum of a multivariate regression function” Ann. Statist. 40 (2012) 2850–2876. E. Belitser, S. Ghosal, H. van Zanten. Annals of Statistics 49(1): 612-613. Published 2021-02-01. DOI: 10.1214/20-AOS1993
Correction note: Higher order elicitability and Osband’s principle. Tobias Fissler, Johanna F. Ziegel. Annals of Statistics 49(1): 614-614. Published 2021-02-01. DOI: 10.1214/20-AOS2014
Covariances and spectral density functions play a fundamental role in the theory of time series. There is a well-developed asymptotic theory for their estimates for low-dimensional stationary processes. For high-dimensional nonstationary processes, however, many important problems on their asymptotic behaviors are still unanswered. This paper presents a systematic asymptotic theory for the estimates of time-varying second-order statistics for a general class of high-dimensional locally stationary processes. Using the framework of functional dependence measure, we derive convergence rates of the estimates which depend on the sample size $T$, the dimension $p$, the moment condition and the dependence of the underlying processes.
Convergence of covariance and spectral density estimates for high-dimensional locally stationary processes. Danna Zhang, W. Wu. Annals of Statistics 49(1): 233-254. Published 2021-02-01. DOI: 10.1214/20-AOS1954
Pub Date: 2021-02-01; Epub Date: 2021-01-29; DOI: 10.1214/20-aos1951
Yinqiu He, Gongjun Xu, Chong Wu, Wei Pan
Many high-dimensional hypothesis tests aim to globally examine marginal or low-dimensional features of a high-dimensional joint distribution, such as testing of mean vectors, covariance matrices and regression coefficients. This paper constructs a family of U-statistics as unbiased estimators of the $\ell_p$-norms of those features. We show that under the null hypothesis, the U-statistics of different finite orders are asymptotically independent and normally distributed. Moreover, they are also asymptotically independent of the maximum-type test statistic, whose limiting distribution is an extreme value distribution. Based on the asymptotic independence property, we propose an adaptive testing procedure which combines p-values computed from the U-statistics of different orders. We further establish power analysis results and show that the proposed adaptive procedure maintains high power against various alternatives.
Asymptotically independent U-statistics in high-dimensional testing. Yinqiu He, Gongjun Xu, Chong Wu, Wei Pan. Annals of Statistics 49(1): 154-181. Published 2021-02-01. DOI: 10.1214/20-aos1951. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8634550/pdf/nihms-1737820.pdf
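For a concrete feel for the U-statistics mentioned in the abstract above, consider the simplest case: the squared Euclidean norm of a mean vector, estimated from independent observations. The sketch below is an assumption-laden illustration rather than the paper's general construction. It averages products of two distinct observations coordinate by coordinate, which is unbiased because the expectation of a product of independent draws is the squared mean; the function name is hypothetical.

```python
def squared_l2_norm_ustat(samples):
    """Order-2 U-statistic for the squared Euclidean norm of the mean vector.

    For each coordinate j it averages x[i][j] * x[k][j] over all ordered
    pairs i != k, using sum_{i != k} x_i x_k = (sum x)^2 - sum x^2.
    """
    n = len(samples)
    dim = len(samples[0])
    total = 0.0
    for j in range(dim):
        column = [row[j] for row in samples]
        column_sum = sum(column)
        column_sum_of_squares = sum(x * x for x in column)
        total += (column_sum ** 2 - column_sum_of_squares) / (n * (n - 1))
    return total


if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0, 4.0]]
    print(squared_l2_norm_ustat(data))
```

By contrast, the plug-in estimate that squares the sample mean includes the terms with i = k, which adds a positive bias of order dimension over sample size; dropping those terms is what makes the U-statistic attractive when the dimension is large.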
Nonparametric estimation of the cumulative distribution function and the probability density of a lifetime $X$ modified by current status censoring (CSC), including cases of right and left missing data, is a classical ill-posed problem with biased data. The biased nature of CSC data may preclude us from consistent estimation unless the biasing function is known or may be estimated, and its ill-posed nature slows down rates of convergence. Under a traditionally studied CSC, we observe a sample from $(Z,\Delta)$ where a continuous monitoring time $Z$ is independent of $X$, $\Delta := I(X \leq Z)$ is the status, and the bias of observations is created by the density of $Z$, which is estimable. In the presence of right or left missing, we observe corresponding samples from $(\Delta Z,\Delta)$ or $((1-\Delta)Z,\Delta)$; the data are again biased but now the density of $Z$ cannot be estimated from the data. As a result, to solve the estimation problem, either the density of $Z$ must be known (as in a controlled study) or an extra cross-sectional sampling of $Z$, which is typically simpler than an underlying CSC study, must be conducted. The main aim of the paper is to develop for this biased and ill-posed problem the theory of efficient (sharp-minimax) estimation, which is inspired by known results for the case of directly observed $X$. Among interesting aspects of the developed theory: (i) While sharp-minimax analysis of missing CSC may follow the classical Pinsker’s methodology, analysis of CSC requires a more complicated estimation procedure based on a special smoothing in both frequency and time domains; (ii) Efficient estimation requires solving a long-standing problem of approximating aperiodic Sobolev functions; (iii) If smoothness of the cdf of $X$ is known, then its rate-minimax estimation is possible even if the density of $Z$ is rougher. Real and simulated examples, as well as extensions of the core models to dependent $X$ and $Z$ and case-control CSC, are presented.
Sharp minimax distribution estimation for current status censoring with or without missing. S. Efromovich. Annals of Statistics 49(1): 568-589. Published 2021-02-01. DOI: 10.1214/20-AOS1970
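The current status censoring model in the abstract above can be made concrete with a short simulation. This is only a sketch under assumed distributions (an exponential lifetime and a uniform monitoring time, both chosen here for illustration), and the crude local average below is not the sharp-minimax estimator the paper develops. It relies on one fact that follows from the independence of the lifetime and the monitoring time: the probability that the status equals 1 given a monitoring time z is the lifetime cdf evaluated at z.

```python
import math
import random


def simulate_current_status(n, seed=0):
    """Draw (Z, Delta) pairs; the lifetime X itself is never recorded."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        lifetime = rng.expovariate(1.0)       # latent X ~ Exp(1) (assumed)
        monitor_time = rng.uniform(0.0, 3.0)  # Z ~ Uniform(0, 3), independent of X (assumed)
        data.append((monitor_time, 1 if lifetime <= monitor_time else 0))
    return data


def local_cdf_estimate(data, point, half_width=0.1):
    """Fraction of Delta = 1 among observations with Z close to `point`."""
    statuses = [status for z, status in data if abs(z - point) <= half_width]
    return sum(statuses) / len(statuses)


if __name__ == "__main__":
    sample = simulate_current_status(20_000)
    print("estimate of F(1):", local_cdf_estimate(sample, 1.0))
    print("true F(1):       ", 1.0 - math.exp(-1.0))
```

Only observations with a monitoring time near the evaluation point carry information about the cdf there, which hints at why the problem is ill-posed and why convergence is slower than with directly observed lifetimes.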