
Annals of Statistics: Latest Articles

Extreme value inference for heterogeneous power law data
Zone 1, Mathematics; Q1, Statistics & Probability; Pub Date: 2023-06-01; DOI: 10.1214/23-aos2294
John H.J. Einmahl, Yi He
We extend extreme value statistics to independent data with possibly very different distributions. In particular, we present novel asymptotic normality results for the Hill estimator, which now estimates the extreme value index of the average distribution. Due to the heterogeneity, the asymptotic variance can be substantially smaller than that in the i.i.d. case. As a special case, we consider a heterogeneous scales model where the asymptotic variance can be calculated explicitly. The primary tool for the proofs is the functional central limit theorem for a weighted tail empirical process. We also present asymptotic normality results for the extreme quantile estimator. A simulation study shows the good finite-sample behavior of our limit theorems. We also present applications to assess the tail heaviness of earthquake energies and of cross-sectional stock market losses.
Citations: 0
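The Hill estimator named in the Einmahl and He abstract above can be illustrated with a minimal sketch. This is the classical Hill estimator applied to simulated data, not the authors' code: the function names, the Pareto simulation with scales varying from 1 to 3, and the choice of `k` are all assumptions made for illustration.

```python
import math
import random


def hill_estimator(sample, k):
    """Classical Hill estimate of the extreme value index from the k largest values."""
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold observation must be positive")
    # Average log-excess of the k largest observations over the threshold.
    return sum(math.log(x / threshold) for x in ordered[:k]) / k


def simulate_heterogeneous_pareto(n, tail_index, seed=0):
    """Independent Pareto draws with a common tail index but different scales.

    Hypothetical setup for illustration only; the scale grows linearly from 1 to 3.
    """
    rng = random.Random(seed)
    draws = []
    for i in range(n):
        scale = 1.0 + 2.0 * i / n
        u = 1.0 - rng.random()  # uniform on (0, 1], so the power below is finite
        draws.append(scale * u ** (-1.0 / tail_index))
    return draws


if __name__ == "__main__":
    data = simulate_heterogeneous_pareto(n=5000, tail_index=2.0)
    # The estimate should be near 1 / tail_index = 0.5.
    print(hill_estimator(data, k=200))
```

In this simulated setup every observation has the same tail index, so the average distribution has extreme value index 0.5, which is the quantity the abstract says the Hill estimator targets under heterogeneity.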
Inference on the maximal rank of time-varying covariance matrices using high-frequency data
Zone 1, Mathematics; Q1, Statistics & Probability; Pub Date: 2023-04-01; DOI: 10.1214/23-aos2273
Markus Reiss, Lars Winkelmann
We study the rank of the instantaneous or spot covariance matrix $\Sigma_X(t)$ of a multidimensional process $X(t)$. Given high-frequency observations $X(i/n)$, $i = 0, \ldots, n$, we test the null hypothesis $\operatorname{rank}(\Sigma_X(t)) \le r$ for all $t$ against local alternatives where the average $(r+1)$st eigenvalue is larger than some signal detection rate $v_n$. A major problem is that the inherent averaging in local covariance statistics produces a bias that distorts the rank statistics. We show that the bias depends on the regularity and spectral gap of $\Sigma_X(t)$. We establish explicit matrix perturbation and concentration results that provide nonasymptotic uniform critical values and optimal signal detection rates $v_n$. This leads to a rank estimation method via sequential testing. For a class of stochastic volatility models, we determine data-driven critical values via normed $p$-variations of estimated local covariance matrices. The methods are illustrated by simulations and an application to high-frequency data of U.S. government bonds.
Citations: 0
Optimally tackling covariate shift in RKHS-based nonparametric regression
Zone 1, Mathematics; Q1, Statistics & Probability; Pub Date: 2023-04-01; DOI: 10.1214/23-aos2268
Cong Ma, Reese Pathak, Martin J. Wainwright
We study the covariate shift problem in the context of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We focus on two natural families of covariate shift problems defined using the likelihood ratios between the source and target distributions. When the likelihood ratios are uniformly bounded, we prove that the kernel ridge regression (KRR) estimator with a carefully chosen regularization parameter is minimax rate-optimal (up to a log factor) for a large family of RKHSs with regular kernel eigenvalues. Interestingly, KRR does not require full knowledge of the likelihood ratio apart from an upper bound on it. In striking contrast to the standard statistical setting without covariate shift, we also demonstrate that a naïve estimator, which minimizes the empirical risk over the function class, is strictly suboptimal under covariate shift as compared to KRR. We then address the larger class of covariate shift problems where likelihood ratio is possibly unbounded yet has a finite second moment. Here, we propose a reweighted KRR estimator that weights samples based on a careful truncation of the likelihood ratios. Again, we are able to show that this estimator is minimax optimal, up to logarithmic factors.
Citations: 0
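The reweighting idea in the Ma, Pathak and Wainwright abstract above (weight each sample by a truncated likelihood ratio, then fit kernel ridge regression) can be sketched as follows. This is only an illustration under stated assumptions: the Gaussian kernel, one-dimensional inputs, the user-supplied `truncation` cap, and all function names are hypothetical choices, and the abstract does not give the truncation level the authors actually use.

```python
import math


def gaussian_kernel(x, z, bandwidth=1.0):
    """Gaussian kernel on the real line (an assumed kernel choice)."""
    return math.exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_reweighted_krr(xs, ys, likelihood_ratios, truncation, ridge):
    """Weighted kernel ridge regression with likelihood ratios capped at `truncation`.

    Minimises (1/n) * sum_i w_i * (y_i - f(x_i))^2 + ridge * ||f||^2 over
    f = sum_j alpha_j * k(., x_j), where w_i = min(ratio_i, truncation).
    """
    n = len(xs)
    weights = [min(r, truncation) for r in likelihood_ratios]
    # A sufficient first-order condition is (W K + n * ridge * I) alpha = W y.
    system = [
        [weights[i] * gaussian_kernel(xs[i], xs[j]) + (n * ridge if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    rhs = [weights[i] * ys[i] for i in range(n)]
    alpha = solve_linear_system(system, rhs)
    return lambda x: sum(a * gaussian_kernel(x, xj) for a, xj in zip(alpha, xs))


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [0.0, 1.0, 0.0, 1.0]
    ratios = [4.0, 1.0, 0.5, 0.2]  # hypothetical target/source density ratios
    f = fit_reweighted_krr(xs, ys, ratios, truncation=2.0, ridge=1e-3)
    print(f(1.5))
```

Capping the weights bounds the influence of any single source sample, which is the role the abstract assigns to truncation when the likelihood ratio is unbounded.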
On high-dimensional Poisson models with measurement error: Hypothesis testing for nonlinear nonconvex optimization.
IF 4.5; Zone 1, Mathematics; Q1, Statistics & Probability; Pub Date: 2023-02-01; DOI: 10.1214/22-aos2248
Fei Jiang, Yeqing Zhou, Jianxuan Liu, Yanyuan Ma

We study estimation and testing in the Poisson regression model with noisy high dimensional covariates, which has wide applications in analyzing noisy big data. Correcting for the estimation bias due to the covariate noise leads to a non-convex target function to minimize. Treating the high dimensional issue further leads us to augment an amenable penalty term to the target function. We propose to estimate the regression parameter through minimizing the penalized target function. We derive the $L_1$ and $L_2$ convergence rates of the estimator and prove the variable selection consistency. We further establish the asymptotic normality of any subset of the parameters, where the subset can have infinitely many components as long as its cardinality grows sufficiently slowly. We develop Wald and score tests based on the asymptotic normality of the estimator, which permits testing of linear functions of the members of the subset. We examine the finite sample performance of the proposed tests by extensive simulation. Finally, the proposed method is successfully applied to the Alzheimer's Disease Neuroimaging Initiative study, which motivated this work initially.

Annals of Statistics, vol. 51, no. 1, pp. 233-259. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10438917/pdf/nihms-1868138.pdf
Citations: 0
On High dimensional Poisson models with measurement error: hypothesis testing for nonlinear nonconvex optimization
IF 4.5; Zone 1, Mathematics; Q1, Statistics & Probability; Pub Date: 2022-12-31; DOI: 10.48550/arXiv.2301.00139
Fei Jiang, Yeqing Zhou, Jianxuan Liu, Yanyuan Ma
We study estimation and testing in the Poisson regression model with noisy high dimensional covariates, which has wide applications in analyzing noisy big data. Correcting for the estimation bias due to the covariate noise leads to a non-convex target function to minimize. Treating the high dimensional issue further leads us to augment an amenable penalty term to the target function. We propose to estimate the regression parameter through minimizing the penalized target function. We derive the $L_1$ and $L_2$ convergence rates of the estimator and prove the variable selection consistency. We further establish the asymptotic normality of any subset of the parameters, where the subset can have infinitely many components as long as its cardinality grows sufficiently slowly. We develop Wald and score tests based on the asymptotic normality of the estimator, which permits testing of linear functions of the members of the subset. We examine the finite sample performance of the proposed tests by extensive simulation. Finally, the proposed method is successfully applied to the Alzheimer's Disease Neuroimaging Initiative study, which motivated this work initially.
Annals of Statistics, vol. 51, no. 1, pp. 233-259.
Citations: 1
BATCH POLICY LEARNING IN AVERAGE REWARD MARKOV DECISION PROCESSES.
IF 4.5; Zone 1, Mathematics; Q1, Statistics & Probability; Pub Date: 2022-12-01; DOI: 10.1214/22-aos2231
Peng Liao, Zhengling Qi, Runzhe Wan, Predrag Klasnja, Susan A Murphy

We consider the batch (off-line) policy learning problem in the infinite horizon Markov Decision Process. Motivated by mobile health applications, we focus on learning a policy that maximizes the long-term average reward. We propose a doubly robust estimator for the average reward and show that it achieves semiparametric efficiency. Further we develop an optimization algorithm to compute the optimal policy in a parameterized stochastic policy class. The performance of the estimated policy is measured by the difference between the optimal average reward in the policy class and the average reward of the estimated policy and we establish a finite-sample regret guarantee. The performance of the method is illustrated by simulation studies and an analysis of a mobile health study promoting physical activity.

Annals of Statistics, vol. 50, no. 6, pp. 3364-3387. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10072865/pdf/nihms-1837036.pdf
Citations: 55
LINEAR BIOMARKER COMBINATION FOR CONSTRAINED CLASSIFICATION.
IF 4.5; Zone 1, Mathematics; Q1, Statistics & Probability; Pub Date: 2022-10-01; Epub Date: 2022-10-27; DOI: 10.1214/22-aos2210
Yijian Huang, Martin G Sanda

Multiple biomarkers are often combined to improve disease diagnosis. The uniformly optimal combination, i.e., with respect to all reasonable performance metrics, unfortunately requires excessive distributional modeling, to which the estimation can be sensitive. An alternative strategy is rather to pursue local optimality with respect to a specific performance metric. Nevertheless, existing methods may not target clinical utility of the intended medical test, which usually needs to operate above a certain sensitivity or specificity level, or do not have their statistical properties well studied and understood. In this article, we develop and investigate a linear combination method to maximize the clinical utility empirically for such a constrained classification. The combination coefficient is shown to have cube root asymptotics. The convergence rate and limiting distribution of the predictive performance are subsequently established, exhibiting robustness of the method in comparison with others. An algorithm with sound statistical justification is devised for efficient and high-quality computation. Simulations corroborate the theoretical results, and demonstrate good statistical and computational performance. Illustration with a clinical study on aggressive prostate cancer detection is provided.

Annals of Statistics, vol. 50, no. 5, pp. 2793-2815. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9635489/pdf/nihms-1819429.pdf
Citations: 1
Quality of life and surgical outcomes of robotic retromuscular ventral hernia repair using a new hybrid mesh reinforcement.
IF 2.3; Zone 1, Mathematics; Q1, Statistics & Probability; Pub Date: 2022-06-01; Epub Date: 2022-04-28; DOI: 10.1007/s10029-022-02619-5
Omar Yusef Kudsi, Georges Kaoukabani, Naseem Bou-Ayash, Kelly Vallar, Alexandra Chudner, Sara LaGrange, Fahri Gokcal

Purpose: The purpose of this study is to prospectively evaluate surgical and quality of life (QoL) outcomes of robotic retromuscular ventral hernia repair (rRMVHR) using a new hybrid mesh in high-risk patients.

Methods: Data was prospectively collected for patients classified as high-risk based on the modified ventral hernia working group (VHWG) grading system, who underwent rRMVHR using Synecor™ Pre hybrid mesh in a single center, between 2019 and 2020. Pre-, intra- and postoperative variables including hernia recurrence, surgical site events (SSE), hernia-specific quality of life (QoL), and financial costs were analyzed. QoL assessments were obtained from preoperative and postoperative patient visits. Kaplan-Meier survival analysis was performed to analyze the estimated recurrence-free time.

Results: Fifty-two high-risk patients, with a mean (±SD) age of 58.6 ± 13.7 years and BMI of 36.9 ± 6.6 kg/m², were followed for a mean (±SD) period of 22.4 ± 7.1 months. A total of 11 (21.2%) patients experienced postoperative complications, out of which eight were SSEs, including 7 (13.5%) seromas, 1 (1.9%) hematoma, and no infections. Procedural interventions were required for 2 (3.8%) surgical site occurrences. Recurrence was seen in 1 (1.9%) patient. The estimated mean (95% confidence interval) recurrence-free time was 33 (32.3-34.5) months. Postoperative QoL assessments demonstrated significant improvements in comparison to preoperative QoL, with a minimum ∆mean (±SD) of -15.5 ± 2.2 at one month (p < 0.001). The mean (±SD) procedure cost was $13,924.18 ± 7856.95 which includes the average mesh cost ($5390.12 ± 3817.03).

Conclusion: Our study showed favorable early and mid-term outcomes, in addition to significant improvements in QoL, after rRMVHR using Synecor™ hybrid mesh in high-risk patients.

Volume 20 1 (as listed), pp. 881-888.
Citations: 0
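The Kaplan-Meier analysis mentioned in the Methods of the hernia study above rests on the product-limit estimator, which can be sketched in a few lines. The function name and the follow-up data below are hypothetical and are not taken from the study; the sketch only shows how censored follow-up times enter a recurrence-free probability curve.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the recurrence-free probability.

    times:  follow-up time of each patient
    events: True if recurrence was observed at that time, False if censored
    Returns a list of (time, probability) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up times in months; not data from the study.
    months = [6, 12, 18, 24, 24, 30]
    recurred = [False, True, False, False, True, False]
    for t, s in kaplan_meier(months, recurred):
        print(t, round(s, 3))
```

Censored patients contribute to the risk set up to their last visit without counting as recurrences, which is why the method suits a cohort with varying follow-up lengths.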
DOUBLY DEBIASED LASSO: HIGH-DIMENSIONAL INFERENCE UNDER HIDDEN CONFOUNDING.
IF 4.5; Zone 1, Mathematics; Q1, Statistics & Probability; Pub Date: 2022-06-01; Epub Date: 2022-06-16; DOI: 10.1214/21-aos2152
Zijian Guo, Domagoj Ćevid, Peter Bühlmann

Inferring causal relationships or related associations from observational data can be invalidated by the existence of hidden confounding. We focus on a high-dimensional linear regression setting, where the measured covariates are affected by hidden confounding and propose the Doubly Debiased Lasso estimator for individual components of the regression coefficient vector. Our advocated method simultaneously corrects both the bias due to estimation of high-dimensional parameters as well as the bias caused by the hidden confounding. We establish its asymptotic normality and also prove that it is efficient in the Gauss-Markov sense. The validity of our methodology relies on a dense confounding assumption, i.e. that every confounding variable affects many covariates. The finite sample performance is illustrated with an extensive simulation study and a genomic application.

Annals of Statistics, vol. 50, no. 3, pp. 1320-1347. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9365063/pdf/nihms-1824950.pdf
Citations: 23
OPTIMAL FALSE DISCOVERY RATE CONTROL FOR LARGE SCALE MULTIPLE TESTING WITH AUXILIARY INFORMATION.
IF 4.5 Zone 1 Mathematics Q1 STATISTICS & PROBABILITY Pub Date: 2022-04-01 DOI: 10.1214/21-aos2128
Hongyuan Cao, Jun Chen, Xianyang Zhang

Large-scale multiple testing is a fundamental problem in high-dimensional statistical inference. It is increasingly common that various types of auxiliary information, reflecting the structural relationship among the hypotheses, are available. Exploiting such auxiliary information can boost statistical power. To this end, we propose a framework based on a two-group mixture model with varying probabilities of being null for different hypotheses a priori, where a shape-constrained relationship is imposed between the auxiliary information and the prior probabilities of being null. An optimal rejection rule is designed to maximize the expected number of true positives when the average false discovery rate is controlled. Focusing on the ordered structure, we develop a robust EM algorithm to estimate the prior probabilities of being null and the distribution of p-values under the alternative hypothesis simultaneously. We show that the proposed method has better power than state-of-the-art competitors while controlling the false discovery rate, both empirically and theoretically. Extensive simulations demonstrate the advantage of the proposed method. Datasets from genome-wide association studies are used to illustrate the new methodology.
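The two-group idea in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the paper's procedure: it assumes the hypothesis-specific null probabilities are already known (the paper estimates them with an EM algorithm), assumes uniform null p-values, and assumes a Beta(a, 1) density for alternative p-values. The function names and the parameter `a` are hypothetical. The rule shown ranks hypotheses by local false discovery rate and rejects the largest set whose average local false discovery rate stays at or below `alpha`, a form commonly used with two-group models.

```python
def local_fdr(p, null_prob, a=0.3):
    """Posterior probability of the null given p-value p (requires 0 < p <= 1).

    Assumptions: null p-values are Uniform(0, 1) with density 1, and
    alternative p-values follow a Beta(a, 1) density a * p**(a - 1), 0 < a < 1.
    """
    alt_density = a * p ** (a - 1.0)
    return null_prob / (null_prob + (1.0 - null_prob) * alt_density)


def reject_by_average_lfdr(pvalues, null_probs, alpha=0.1, a=0.3):
    """Return sorted indices of rejected hypotheses.

    Hypotheses are ordered by increasing local fdr; the rejection set is the
    longest prefix whose running mean of local fdr values is at most alpha.
    """
    scores = [local_fdr(p, pi0, a) for p, pi0 in zip(pvalues, null_probs)]
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    running_sum = 0.0
    n_reject = 0
    for rank, idx in enumerate(order, start=1):
        running_sum += scores[idx]
        if running_sum / rank <= alpha:
            n_reject = rank
    return sorted(order[:n_reject])


if __name__ == "__main__":
    pvalues = [0.001, 0.8, 0.02, 0.5]
    null_probs = [0.5, 0.9, 0.5, 0.9]   # hypothetical auxiliary-informed priors
    print(reject_by_average_lfdr(pvalues, null_probs, alpha=0.15))
```

The role of auxiliary information shows up through `null_probs`: the same p-value yields a smaller local false discovery rate, and so is more likely to be rejected, when its prior probability of being null is lower.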

Annals of Statistics, vol. 50, no. 2, pp. 807-857. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10153594/pdf/nihms-1840915.pdf
Citations: 15