Statistical Inference for High-Dimensional Generalized Linear Models with Binary Outcomes.

Journal of the American Statistical Association (IF 3; Zone 1, Mathematics; JCR Q1, Statistics & Probability). Pub Date: 2023-01-01. Epub Date: 2021-12-09. DOI: 10.1080/01621459.2021.1990769
T Tony Cai, Zijian Guo, Rong Ma
Journal of the American Statistical Association, vol. 118, no. 542, pp. 1319-1332. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10292730/pdf/nihms-1824949.pdf

Abstract

This paper develops a unified statistical inference framework for high-dimensional binary generalized linear models (GLMs) with general link functions. Both unknown and known design distribution settings are considered. A two-step weighted bias-correction method is proposed for constructing confidence intervals and simultaneous hypothesis tests for individual components of the regression vector. A minimax lower bound for the expected length is established, and the proposed confidence intervals are shown to be rate-optimal up to a logarithmic factor. The numerical performance of the proposed procedure is demonstrated through simulation studies and an analysis of a single-cell RNA-seq data set, which yields interesting biological insights that integrate well into the current literature on cellular immune response mechanisms as characterized by single-cell transcriptomics. The theoretical analysis provides important insights into the adaptivity of optimal confidence intervals with respect to the sparsity of the regression vector. New lower bound techniques are introduced that may be of independent interest for solving other inference problems in high-dimensional binary GLMs.
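The abstract names a two-step weighted bias-correction method but does not spell out its details. As a rough illustration of the general idea only (start from a shrunken estimate, then correct one coordinate and attach a confidence interval), the sketch below uses a generic one-step correction for logistic regression. Everything in it is an assumption made for illustration, not the authors' procedure: the logistic link, the low-dimensional simulated data, the ridge-penalized initial estimator (a lasso would be the usual choice in high dimensions), the exact inverse Hessian as the correction direction, and the hypothetical function names `ridge_logistic` and `debiased_ci`.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def ridge_logistic(X, y, lam, n_iter=50):
    """Initial (biased) estimate: ridge-penalized logistic regression via Newton steps."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = sigmoid(X @ beta)
        grad = X.T @ (mu - y) / n + lam * beta
        hess = (X.T * (mu * (1.0 - mu))) @ X / n + lam * np.eye(p)
        beta = beta - np.linalg.solve(hess, grad)
    return beta

def debiased_ci(X, y, beta_init, j, z=1.96):
    """One-step bias correction of coordinate j, with a normal-approximation interval.

    Assumption: the sample Hessian is invertible (low-dimensional toy setting),
    so the correction direction u is simply column j of its inverse.
    """
    n, p = X.shape
    mu = sigmoid(X @ beta_init)
    hess = (X.T * (mu * (1.0 - mu))) @ X / n       # unpenalized Hessian at beta_init
    u = np.linalg.solve(hess, np.eye(p)[:, j])     # correction direction
    score = X.T @ (y - mu) / n                     # unpenalized score at beta_init
    beta_j = beta_init[j] + u @ score              # bias-corrected coordinate
    se = np.sqrt(u @ hess @ u / n)                 # plug-in standard error
    return beta_j, (beta_j - z * se, beta_j + z * se)

# Simulated data (illustrative values, not from the paper).
rng = np.random.default_rng(0)
n, p = 20000, 5
beta_true = np.array([1.0, -0.5, 0.0, 0.0, 0.5])
X = rng.standard_normal((n, p))
y = (rng.random(n) < sigmoid(X @ beta_true)).astype(float)

beta_ridge = ridge_logistic(X, y, lam=0.05)
beta_db, ci = debiased_ci(X, y, beta_ridge, j=0)
print("ridge estimate of beta_0:   ", beta_ridge[0])
print("debiased estimate of beta_0:", beta_db)
print("approximate 95% interval:   ", ci)
```

The ridge penalty shrinks the initial estimate toward zero; the correction term adds back a projection of the remaining score, which is what makes a normal-approximation interval reasonable. In a genuinely high-dimensional setting the Hessian is not invertible, so the direction `u` would have to come from a separate regularized optimization rather than `np.linalg.solve`.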

Source journal
CiteScore: 7.50
Self-citation rate: 8.10%
Articles published: 168
Review time: 12 months
Journal description: Established in 1888 and published quarterly in March, June, September, and December, the Journal of the American Statistical Association (JASA) has long been considered the premier journal of statistical science. Articles focus on statistical applications, theory, and methods in economic, social, physical, engineering, and health sciences. Important books contributing to statistical advancement are reviewed in JASA. JASA is indexed in Current Index to Statistics and MathSci Online and reviewed in Mathematical Reviews. JASA is abstracted by Access Company and is indexed and abstracted in the SRM Database of Social Research Methodology.