Annals of Statistics: Latest Articles


Necessary and sufficient conditions for variable selection consistency of the LASSO in high dimensions

IF 4.5 | Zone 1, Mathematics | Q1 STATISTICS & PROBABILITY

Annals of Statistics

Pub Date : 2021-04-01 DOI: 10.1214/20-AOS1979

S. Lahiri

This paper investigates variable selection consistency of the LASSO in high-dimensional regression models and gives necessary and sufficient conditions for it, potentially allowing the model dimension $p$ to grow arbitrarily fast as a function of the sample size $n$. These conditions require both upper and lower bounds on the growth rate of the penalty parameter. It turns out that a variant of the irrepresentable condition (IRC) of Zhao and Yu (2006), herein called the lower irrepresentable condition (LIRC), is determined by the lower-bound considerations, while the upper-bound considerations lead to a new condition, called the upper irrepresentable condition (UIRC) in this paper. It is shown that the LIRC together with the UIRC is necessary and sufficient for variable selection consistency of the LASSO, thereby settling a conjecture of Zhao and Yu (2006). Further, it is shown that under some mild regularity conditions, the penalty parameter must tend to infinity at a certain minimal rate to ensure variable selection consistency of the LASSO, and that the corresponding LASSO estimators of the nonzero regression parameters cannot be $\sqrt{n}$-consistent (even for individual parameters). Thus, under fairly general conditions, the LASSO with a single choice of the penalty parameter cannot achieve both variable selection consistency and $\sqrt{n}$-consistency simultaneously. MSC 2010 subject classifications: Primary 62E20; secondary 62J05.
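As a rough illustration of the kind of condition discussed in this abstract, the sketch below checks the classical irrepresentable condition of Zhao and Yu (2006), not the LIRC or UIRC of the paper, whose exact forms are not given here. It assumes the usual formulation: with $C = X^{\top}X/n$ partitioned by the active set, the condition asks that every entry of $C_{21} C_{11}^{-1}\,\mathrm{sign}(\beta_{\text{active}})$ has absolute value below 1. The function name `irrepresentable_gap` and the two toy designs are hypothetical.

```python
import numpy as np

def irrepresentable_gap(X, beta):
    """Return 1 - max_j |C21 C11^{-1} sign(beta_active)|_j.

    A positive value means the classical irrepresentable condition holds
    for this design and sign pattern (assumes at least one inactive column).
    """
    n, p = X.shape
    active = np.flatnonzero(beta)
    inactive = np.setdiff1d(np.arange(p), active)
    C = X.T @ X / n                          # Gram matrix of the design
    C11 = C[np.ix_(active, active)]
    C21 = C[np.ix_(inactive, active)]
    v = C21 @ np.linalg.solve(C11, np.sign(beta[active]))
    return 1.0 - float(np.max(np.abs(v)))

rng = np.random.default_rng(0)
n = 50
Q, _ = np.linalg.qr(rng.standard_normal((n, 3)))   # orthonormal columns
q1, q2, q3 = Q.T
beta = np.array([1.0, -2.0, 0.0])                  # third variable is inactive

# Orthogonal design: the inactive column is uncorrelated with the active ones.
X_good = np.sqrt(n) * np.column_stack([q1, q2, q3])
gap_good = irrepresentable_gap(X_good, beta)

# Inactive column strongly aligned with the signed active columns.
X_bad = np.sqrt(n) * np.column_stack([q1, q2, 0.6 * (q1 - q2) + 0.5 * q3])
gap_bad = irrepresentable_gap(X_bad, beta)

print(gap_good > 0, gap_bad > 0)
```

In the second design the inactive column has correlation 0.6 and -0.6 with the two active columns, so the signed sum reaches 1.2 and the condition fails.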


Citations: 12

ASYMMETRY HELPS: EIGENVALUE AND EIGENVECTOR ANALYSES OF ASYMMETRICALLY PERTURBED LOW-RANK MATRICES.

IF 4.5 | Zone 1, Mathematics | Q1 STATISTICS & PROBABILITY

Annals of Statistics

Pub Date : 2021-02-01 Epub Date: 2021-01-29 DOI: 10.1214/20-aos1963

Yuxin Chen, Chen Cheng, Jianqing Fan

This paper is concerned with the interplay between statistical asymmetry and spectral methods. Suppose we are interested in estimating a rank-1 and symmetric matrix $M^{\star} \in \mathbb{R}^{n \times n}$, yet only a randomly perturbed version $M$ is observed. The noise matrix $M - M^{\star}$ is composed of independent (but not necessarily homoscedastic) entries and is, therefore, not symmetric in general. This might arise if, for example, we have two independent samples for each entry of $M^{\star}$ and arrange them in an asymmetric fashion. The aim is to estimate the leading eigenvalue and the leading eigenvector of $M^{\star}$. We demonstrate that the leading eigenvalue of the data matrix $M$ can be $O(\sqrt{n})$ times more accurate (up to some log factor) than the (unadjusted) leading singular value of $M$ in eigenvalue estimation. Moreover, the eigen-decomposition approach is fully adaptive to heteroscedasticity of noise, without the need for any prior knowledge about the noise distributions. In a nutshell, this curious phenomenon arises since the statistical asymmetry automatically mitigates the bias of the eigenvalue approach, thus eliminating the need for careful bias correction. Additionally, we develop appealing non-asymptotic eigenvector perturbation bounds; in particular, we are able to bound the perturbation of any linear function of the leading eigenvector of $M$ (e.g. entrywise eigenvector perturbation). We also provide partial theory for the more general rank-$r$ case. The takeaway message is this: arranging the data samples in an asymmetric manner and performing eigen-decomposition could sometimes be quite beneficial.
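The eigenvalue-versus-singular-value comparison described in this abstract can be tried numerically. The sketch below is only an illustration under assumed settings that are not taken from the paper: a rank-1 truth with eigenvalue 1, i.i.d. Gaussian noise, $n = 400$, and a noise level of $0.3/\sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
lam = 1.0                        # true leading eigenvalue (assumed)
sigma = 0.3 / np.sqrt(n)         # noise standard deviation (assumed)

u = rng.standard_normal(n)
u /= np.linalg.norm(u)
M_star = lam * np.outer(u, u)    # rank-1 symmetric truth

# Independent entries above and below the diagonal, so H is not symmetric.
H = sigma * rng.standard_normal((n, n))
M = M_star + H

# Leading eigenvalue of the asymmetric data matrix (take the real part).
eigvals = np.linalg.eigvals(M)
lead_eig = eigvals[np.argmax(eigvals.real)].real

# Leading singular value of the same matrix, with no bias adjustment.
lead_sv = np.linalg.svd(M, compute_uv=False)[0]

err_eig = abs(lead_eig - lam)
err_sv = abs(lead_sv - lam)
print(f"eigenvalue error: {err_eig:.4f}, singular value error: {err_sv:.4f}")
```

With these settings the singular value carries an upward bias of roughly $n\sigma^2/\lambda$, while the eigenvalue of the asymmetric matrix does not, which is the effect the abstract attributes to asymmetry.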


Annals of Statistics, vol. 49, pp. 435-458. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8300484/pdf/nihms-1639565.pdf

Citations: 0

A rule of thumb: Run lengths to false alarm of many types of control charts run in parallel on dependent streams are asymptotically independent

IF 4.5 | Zone 1, Mathematics | Q1 STATISTICS & PROBABILITY

Annals of Statistics

Pub Date : 2021-02-01 DOI: 10.1214/20-AOS1968

M. Pollak

Consider a process that produces a series of independent identically distributed vectors. A change in an underlying state may become manifest in a modification of one or more of the marginal distributions. Often, the dependence structure between coordinates is unknown, impeding surveillance based on the joint distribution. A popular approach is to construct control charts for each coordinate separately and raise an alarm the first time any (or some) of the control charts signals. The difficulty is obtaining an expression for the overall average run length to false alarm (ARL2FA). We argue that despite the dependence structure, when the process is in control, for large ARLs to false alarm, run lengths of many types of control charts run in parallel are asymptotically independent. Furthermore, often, in-control run lengths are asymptotically exponentially distributed, enabling uncomplicated asymptotic expressions for the ARL2FA. We prove this assertion for certain Cusum and Shiryaev–Roberts-type control charts and illustrate it by simulations.
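To see why independence plus exponentiality gives a simple expression, consider the following worked calculation. It is a sketch of the standard argument, not a formula quoted from the paper; the symbols $N_i$, $A_i$ and $k$ are introduced here for illustration. Suppose the $k$ in-control run lengths $N_1, \dots, N_k$ are approximately independent and exponential with means $A_1, \dots, A_k$, and an alarm is raised the first time any chart signals.

```latex
% Overall run length when the first signal from any chart raises the alarm
N = \min_{1 \le i \le k} N_i,
\qquad
P(N > t) \approx \prod_{i=1}^{k} e^{-t/A_i}
        = \exp\!\Big(-t \sum_{i=1}^{k} \frac{1}{A_i}\Big),
\qquad
\mathrm{ARL2FA} = E[N] \approx \Big(\sum_{i=1}^{k} \frac{1}{A_i}\Big)^{-1}.
```

For example, $k$ charts each tuned to an individual ARL of $A$ would give an overall ARL2FA of about $A/k$ under these assumptions.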


Citations: 0

Wordlength enumerator for fractional factorial designs

IF 4.5 | Zone 1, Mathematics | Q1 STATISTICS & PROBABILITY

Annals of Statistics

Pub Date : 2021-02-01 DOI: 10.1214/20-AOS1955

Yu Tang, Hongquan Xu

While the minimum aberration criterion is popular for selecting good designs with qualitative factors under an ANOVA model, the minimum $\beta$-aberration criterion is more suitable for selecting designs with quantitative factors under a polynomial model. In this paper, we propose the concept of wordlength enumerator to unify these two criteria. The wordlength enumerator is defined as an average similarity of contrasts among all possible pairs of runs. The wordlength enumerator is easy and fast to compute, and can be used to compare and rank designs efficiently. Based on the wordlength enumerator, we develop simple and fast methods for calculating both the generalized wordlength pattern and the $\beta$-wordlength pattern. We further obtain a lower bound of the wordlength enumerator for three-level designs and characterize the combinatorial structure of designs achieving the lower bound. Finally, we propose two methods for constructing supersaturated designs that have both generalized minimum aberration and minimum $\beta$-aberration.


Citations: 4

Correction note: “Optimal two-stage procedures for estimating location and size of the maximum of a multivariate regression function” Ann. Statist. 40 (2012) 2850–2876

IF 4.5 | Zone 1, Mathematics | Q1 STATISTICS & PROBABILITY

Annals of Statistics

Pub Date : 2021-02-01 DOI: 10.1214/20-AOS1993

E. Belitser, S. Ghosal, H. Zanten

We rectify a wrongly stated fact in the paper of Belitser, Ghosal and van Zanten (Ann. Statist. 40 (2012) 2850–2876).


Citations: 1

Correction note: Higher order elicitability and Osband’s principle

IF 4.5 | Zone 1, Mathematics | Q1 STATISTICS & PROBABILITY

Annals of Statistics

Pub Date : 2021-02-01 DOI: 10.1214/20-AOS2014

Tobias Fissler, Johanna F. Ziegel

Citations: 3

Convergence of covariance and spectral density estimates for high-dimensional locally stationary processes

IF 4.5 | Zone 1, Mathematics | Q1 STATISTICS & PROBABILITY

Annals of Statistics

Pub Date : 2021-02-01 DOI: 10.1214/20-AOS1954

Danna Zhang, W. Wu

Covariances and spectral density functions play a fundamental role in the theory of time series. There is a well-developed asymptotic theory for their estimates for low-dimensional stationary processes. For high-dimensional nonstationary processes, however, many important problems on their asymptotic behaviors are still unanswered. This paper presents a systematic asymptotic theory for the estimates of time-varying second-order statistics for a general class of high-dimensional locally stationary processes. Using the framework of functional dependence measure, we derive convergence rates of the estimates which depend on the sample size $T$, the dimension $p$, the moment condition and the dependence of the underlying processes.


Citations: 22

ASYMPTOTICALLY INDEPENDENT U-STATISTICS IN HIGH-DIMENSIONAL TESTING.

IF 3.2 | Zone 1, Mathematics | Q1 STATISTICS & PROBABILITY

Annals of Statistics

Pub Date : 2021-02-01 Epub Date: 2021-01-29 DOI: 10.1214/20-aos1951

Yinqiu He, Gongjun Xu, Chong Wu, Wei Pan

Many high-dimensional hypothesis tests aim to globally examine marginal or low-dimensional features of a high-dimensional joint distribution, such as testing of mean vectors, covariance matrices and regression coefficients. This paper constructs a family of U-statistics as unbiased estimators of the $\ell_p$-norms of those features. We show that under the null hypothesis, the U-statistics of different finite orders are asymptotically independent and normally distributed. Moreover, they are also asymptotically independent of the maximum-type test statistic, whose limiting distribution is an extreme value distribution. Based on the asymptotic independence property, we propose an adaptive testing procedure which combines p-values computed from the U-statistics of different orders. We further establish power analysis results and show that the proposed adaptive procedure maintains high power against various alternatives.
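As a small illustration of what an unbiased U-statistic for a norm-type feature can look like, the sketch below treats the simplest case only: an order-2 U-statistic for the squared $\ell_2$ norm of a mean vector, $\sum_j \mu_j^2$. This is a textbook construction chosen for illustration and may differ from the family built in the paper; the function names are hypothetical. It averages $x_i^{\top} x_j$ over all pairs $i \neq j$, which is unbiased because independent rows satisfy $E[x_i^{\top} x_j] = \mu^{\top}\mu$.

```python
import numpy as np

def u_stat_sq_l2(X):
    """Order-2 U-statistic for ||mu||_2^2 from an (n, p) sample, in O(np) time.

    Uses the identity sum_{i != j} x_i.x_j = ||sum_i x_i||^2 - sum_i ||x_i||^2.
    """
    n = X.shape[0]
    total = X.sum(axis=0)
    return (total @ total - np.sum(X * X)) / (n * (n - 1))

def u_stat_sq_l2_bruteforce(X):
    """Same statistic by direct summation over all ordered pairs i != j."""
    n = X.shape[0]
    acc = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                acc += X[i] @ X[j]
    return acc / (n * (n - 1))

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 5)) + 0.5    # toy sample with mean 0.5 per coordinate
fast = u_stat_sq_l2(X)
slow = u_stat_sq_l2_bruteforce(X)
print(np.isclose(fast, slow))
```

Excluding the diagonal terms $i = j$ is what removes the bias that a plug-in estimate such as $\|\bar{x}\|_2^2$ would carry.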


Citations: 0

Sharp minimax distribution estimation for current status censoring with or without missing

IF 4.5 | Zone 1, Mathematics | Q1 STATISTICS & PROBABILITY

Annals of Statistics

Pub Date : 2021-02-01 DOI: 10.1214/20-AOS1970

S. Efromovich

Nonparametric estimation of the cumulative distribution function and the probability density of a lifetime $X$ modified by current status censoring (CSC), including cases of right and left missing data, is a classical ill-posed problem with biased data. The biased nature of CSC data may preclude consistent estimation unless the biasing function is known or can be estimated, and its ill-posed nature slows down rates of convergence. Under a traditionally studied CSC, we observe a sample from $(Z,\Delta)$, where a continuous monitoring time $Z$ is independent of $X$, $\Delta := I(X \leq Z)$ is the status, and the bias of observations is created by the density of $Z$, which is estimable. In the presence of right or left missing, we observe corresponding samples from $(\Delta Z,\Delta)$ or $((1-\Delta)Z,\Delta)$; the data are again biased, but now the density of $Z$ cannot be estimated from the data. As a result, to solve the estimation problem, either the density of $Z$ must be known (as in a controlled study) or an extra cross-sectional sampling of $Z$, which is typically simpler than an underlying CSC study, must be conducted. The main aim of the paper is to develop for this biased and ill-posed problem a theory of efficient (sharp-minimax) estimation inspired by known results for the case of directly observed $X$. Among interesting aspects of the developed theory: (i) while sharp-minimax analysis of missing CSC may follow the classical Pinsker methodology, analysis of CSC requires a more complicated estimation procedure based on a special smoothing in both frequency and time domains; (ii) efficient estimation requires solving a long-standing problem of approximating aperiodic Sobolev functions; (iii) if the smoothness of the cdf of $X$ is known, then its rate-minimax estimation is possible even if the density of $Z$ is rougher. Real and simulated examples, as well as extensions of the core models to dependent $X$ and $Z$ and to case-control CSC, are presented.
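The three observation schemes named in this abstract can be made concrete with a short simulation. This sketch only generates data; it does not implement the paper's estimator. The distributions (exponential lifetimes, uniform monitoring times) and the sample size are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000

X = rng.exponential(scale=1.0, size=n)   # lifetimes, never observed directly (assumed Exp(1))
Z = rng.uniform(0.0, 3.0, size=n)        # monitoring times, independent of X (assumed U(0, 3))
delta = (X <= Z).astype(int)             # status: 1 if the event occurred by time Z

# Classical CSC: the pair (Z, delta) is observed, so the density of Z is estimable.
csc_sample = np.column_stack([Z, delta])

# Right missing: (delta * Z, delta) is observed; Z is lost whenever delta = 0.
right_missing = np.column_stack([delta * Z, delta])

# Left missing: ((1 - delta) * Z, delta) is observed; Z is lost whenever delta = 1.
left_missing = np.column_stack([(1 - delta) * Z, delta])

print(csc_sample.shape, int(delta.sum()))
```

Because $E[\Delta \mid Z = z] = F_X(z)$ when $Z$ is independent of $X$, the status carries information about the cdf of $X$ only through the monitoring times, which is why losing part of $Z$ in the two missing schemes prevents estimating its density from the data.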


Annals of Statistics, vol. 49, pp. 568-589.

Citations: 1

Inference for conditional value-at-risk of a predictive regression

IF 4.5 | Zone 1, Mathematics | Q1 STATISTICS & PROBABILITY

Annals of Statistics

Pub Date : 2020-12-01 DOI: 10.1214/19-aos1937

Yi He, Yanxi Hou, L. Peng, Haipeng Shen

Conditional value-at-risk is a popular risk measure in risk management. We study the inference problem of conditional value-at-risk under a linear predictive regression model. We derive the asymptotic distribution of the least squares estimator for the conditional value-at-risk. Our results relax the model assumptions made in Chun et al. (2012) and correct their mistake in the asymptotic variance expression. We show that the asymptotic variance depends on the quantile density function of the unobserved error and whether the model has a predictor with infinite variance, which makes it challenging to actually quantify the uncertainty of the conditional risk measure. To make the inference feasible, we then propose a smooth empirical likelihood based method for constructing a confidence interval for the conditional value-at-risk based on either independent errors or GARCH errors. Our approach not only bypasses the challenge of directly estimating the asymptotic variance but also does not need to know whether there exists an infinite variance predictor in the predictive model. Furthermore, we apply the same idea to the quantile regression method, which allows infinite variance predictors and generalizes the parameter estimation in Whang (2006) to conditional value-at-risk in the supplementary material. We demonstrate the finite sample performance of the derived confidence intervals through numerical studies before applying them to real data.


Annals of Statistics, vol. 48, pp. 3442-3464.

Citations: 10
