High dimensional controlled variable selection with model-X knockoffs in the AFT model

Computational Statistics (IF 1; Zone 4, Mathematics; JCR Q3, STATISTICS & PROBABILITY). Pub Date: 2023-12-09. DOI: 10.1007/s00180-023-01426-5
Baihua He, Di Xia, Yingli Pan
Citations: 0

Abstract

Interpretability and stability are two important properties required when applying statistical methods to high dimensional data. Although many existing forecasting methods address the former to some extent, the latter, in the sense of controlling the fraction of wrongly discovered features, remains largely underdeveloped. Under the accelerated failure time (AFT) model, this paper introduces a controlled variable selection method built on the general framework of Model-X knockoffs to handle high dimensional data. We provide theoretical justification for asymptotic false discovery rate (FDR) control. The proposed method is of interest because of its strong control of the FDR while preserving predictive power. Several simulation examples assess its finite sample performance in terms of the desired interpretability and stability. A real data example from an acute myeloid leukemia study is analyzed to demonstrate the practical utility of the proposed method.
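
The abstract does not state which feature statistics or threshold the paper uses, so the following is only a minimal sketch of the selection step as it commonly appears in the knockoff filter literature (the "knockoff+" threshold). It assumes that a statistic W_j has already been computed for each feature, for example as the difference between the absolute fitted coefficient of feature j and that of its knockoff copy in a penalized AFT fit. The names `W`, `q`, `knockoff_plus_threshold`, and `knockoff_select` are illustrative and do not come from the paper.

```python
import numpy as np


def knockoff_plus_threshold(W, q):
    """Return the smallest t > 0 such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q,
    or infinity if no such t exists (assumed knockoff+ rule)."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Negative statistics estimate the number of false positives.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return float(t)
    return float("inf")


def knockoff_select(W, q):
    """Indices of features whose statistic reaches the threshold."""
    tau = knockoff_plus_threshold(W, q)
    return np.flatnonzero(np.asarray(W, dtype=float) >= tau)


# Hypothetical statistics: six strong signals and four near-zero nulls.
W = np.array([5.0, 4.2, 3.9, 3.1, 2.8, 2.5, -0.4, 0.3, -0.2, 0.1])
selected = knockoff_select(W, q=0.2)
print(selected)
```

Large positive values of W_j indicate that the original feature matters more than its knockoff, while the count of large negative values serves as an estimate of how many false discoveries sit among the positives. When no threshold meets the target level `q`, the sketch selects nothing.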

Source journal: Computational Statistics (Mathematics: Statistics & Probability)
CiteScore: 2.90
Self-citation rate: 0.00%
Articles published: 122
Review time: >12 weeks
About the journal: Computational Statistics (CompStat) is an international journal which promotes the publication of applications and methodological research in the field of Computational Statistics. The focus of papers in CompStat is on the contribution to and influence of computing on statistics and vice versa. The journal provides a forum for computer scientists, mathematicians, and statisticians in a variety of fields of statistics such as biometrics, econometrics, data analysis, graphics, simulation, algorithms, knowledge based systems, and Bayesian computing. CompStat publishes hardware, software plus package reports.