TEST OF SIGNIFICANCE FOR HIGH-DIMENSIONAL LONGITUDINAL DATA.

Annals of Statistics (IF 3.2; Zone 1, Mathematics; JCR Q1, Statistics & Probability). Pub Date: 2020-10-01. Epub Date: 2020-09-19. DOI: 10.1214/19-aos1900
Ethan X Fang, Yang Ning, Runze Li
Annals of Statistics, volume 48(5), pages 2622-2645. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8277154/pdf/nihms-1614211.pdf

Abstract

This paper concerns statistical inference for longitudinal data with ultrahigh-dimensional covariates. We first study the problem of constructing confidence intervals and hypothesis tests for a low-dimensional parameter of interest. The major challenge is how to construct a powerful test statistic in the presence of high-dimensional nuisance parameters and complex within-subject correlation in longitudinal data. To address this challenge, we propose a new quadratic decorrelated inference function approach, which simultaneously removes the impact of the nuisance parameters and incorporates the correlation to enhance the efficiency of the estimation procedure. When the parameter of interest is of fixed dimension, we prove that the proposed estimator is asymptotically normal and attains the semiparametric information bound, based on which we can construct an optimal Wald test statistic. We further extend this result and establish the limiting distribution of the estimator when the dimension of the parameter of interest grows with the sample size at a polynomial rate. Finally, we study how to control the false discovery rate (FDR) when a vector of high-dimensional regression parameters is of interest. We prove that applying Storey's (2002) procedure to the proposed test statistics for each regression parameter controls the FDR asymptotically in longitudinal data. We conduct simulation studies to assess the finite-sample performance of the proposed procedures. Our simulation results indicate that the proposed procedure can control both the Type I error when testing a low-dimensional parameter of interest and the FDR in the multiple testing problem. We also apply the proposed procedure to a real data example.
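The multiple-testing step mentioned in the abstract can be illustrated with a minimal sketch of a Storey-type FDR procedure. It assumes that one asymptotically standard normal test statistic per regression parameter is already available; the paper's decorrelated inference function is not reproduced here. The function names, the fixed tuning value `lam = 0.5`, and the target level `alpha = 0.1` are illustrative assumptions, not taken from the paper.

```python
import math


def two_sided_pvalues(z_stats):
    """Two-sided normal p-values: p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))."""
    return [math.erfc(abs(z) / math.sqrt(2.0)) for z in z_stats]


def storey_reject(pvals, alpha=0.1, lam=0.5):
    """Storey-type FDR thresholding with a fixed tuning parameter lam.

    Returns (reject, pi0_hat), where reject[i] is True if hypothesis i is
    rejected and pi0_hat estimates the proportion of true null hypotheses.
    """
    m = len(pvals)
    if m == 0:
        return [], 1.0

    # Estimate the null proportion from the p-values above lam, capped at 1.
    pi0_hat = min(1.0, sum(p > lam for p in pvals) / (m * (1.0 - lam)))

    # Sort hypotheses by p-value; at the k-th smallest p-value t the
    # estimated FDR is pi0_hat * m * t / k.
    order = sorted(range(m), key=lambda i: pvals[i])
    n_reject = 0
    for rank, idx in enumerate(order, start=1):
        fdr_hat = pi0_hat * m * pvals[idx] / rank
        if fdr_hat <= alpha:
            n_reject = rank  # keep the largest rank that passes

    reject = [False] * m
    for idx in order[:n_reject]:
        reject[idx] = True
    return reject, pi0_hat


if __name__ == "__main__":
    # Hypothetical test statistics, three of them far from zero.
    z = [5.0, 4.0, 0.1, -0.2, 0.5, -0.3, 0.0, 1.0, -4.5, 0.7]
    reject, pi0_hat = storey_reject(two_sided_pvalues(z), alpha=0.1, lam=0.5)
    print(reject, pi0_hat)
```

With `pi0_hat` fixed at 1 this rule reduces to the Benjamini-Hochberg step-up procedure; estimating the null proportion from the data is what can make the Storey-type rule less conservative when many parameters are nonzero.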
