Optimal subsampling for the Cox proportional hazards model with massive survival data

IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Journal of Statistical Planning and Inference · Pub Date: 2023-12-19 · DOI: 10.1016/j.jspi.2023.106136
Nan Qiao, Wangcheng Li, Feng Xiao, Cunjie Lin
Journal of Statistical Planning and Inference, Volume 231, Article 106136. https://www.sciencedirect.com/science/article/pii/S0378375823001052
Citations: 0

Abstract

Massive survival data sets have become common in survival analysis. This study proposes a subsampling algorithm for the Cox proportional hazards model with time-dependent covariates, intended for settings where the sample size is extraordinarily large but computing resources are relatively limited. A subsample estimator is obtained by maximizing a weighted partial likelihood and is shown to be consistent and asymptotically normal. The optimal subsampling probabilities are derived in explicit form by minimizing the asymptotic mean squared error of the subsample estimator. Simulation studies show that the proposed method approximates the full-data estimator satisfactorily. The method is applied to corporate loan data and breast cancer data, which have different censoring rates, and the results also confirm its practical advantages.
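
The general idea of a subsample estimator based on a weighted partial likelihood can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' algorithm: covariates are time-fixed rather than time-dependent, the data are simulated, and the subsampling probabilities `pi` are uniform placeholders because the abstract does not state the explicit optimal form. The function names `fit_weighted_cox` and `subsample_cox` are hypothetical.

```python
import numpy as np

def fit_weighted_cox(time, event, X, w, n_iter=50, tol=1e-8):
    """Newton-Raphson maximizer of a weighted Cox log partial likelihood.

    Illustrative only: assumes time-fixed covariates and distinct observed times.
    """
    order = np.argsort(-time)  # descending time, so each risk set is a prefix
    event, X, w = event[order], X[order], w[order]
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        risk = w * np.exp(X @ beta)                # weighted risk scores
        s0 = np.cumsum(risk)                       # risk-set sums of the scores
        s1 = np.cumsum(risk[:, None] * X, axis=0)  # risk-set sums of score * x
        s2 = np.cumsum(risk[:, None, None] * X[:, :, None] * X[:, None, :], axis=0)
        xbar = s1 / s0[:, None]                    # weighted risk-set covariate mean
        d = w * event                              # only observed events contribute
        score = (d[:, None] * (X - xbar)).sum(axis=0)
        cov = s2 / s0[:, None, None] - xbar[:, :, None] * xbar[:, None, :]
        info = (d[:, None, None] * cov).sum(axis=0)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def subsample_cox(time, event, X, pi, r, rng):
    """Draw r indices with replacement using probabilities pi, then fit with
    inverse-probability weights. Repeated draws are merged into one row whose
    weight is multiplied by its count."""
    drawn = rng.choice(len(time), size=r, replace=True, p=pi)
    idx, counts = np.unique(drawn, return_counts=True)
    w = counts / pi[idx]
    w = w / w.mean()  # rescaling the weights does not change the maximizer
    return fit_weighted_cox(time[idx], event[idx], X[idx], w)

# Simulated data (assumption): unit baseline hazard, exponential censoring.
rng = np.random.default_rng(0)
n, beta_true = 50_000, np.array([0.5, -0.5, 0.25])
X = rng.normal(size=(n, 3))
t_event = rng.exponential(1.0 / np.exp(X @ beta_true))
t_cens = rng.exponential(2.0, size=n)
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens

pi = np.full(n, 1.0 / n)  # uniform placeholder, NOT the paper's optimal probabilities
beta_sub = subsample_cox(time, event, X, pi, r=5_000, rng=rng)
print(beta_sub)
```

Any other probability vector `pi` that sums to one can be passed to `subsample_cox`; the inverse-probability weights are what keep the subsample estimator targeting the full-data estimator when the sampling is non-uniform.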

Source journal
Journal of Statistical Planning and Inference (Mathematics: Statistics & Probability)
CiteScore: 2.10
Self-citation rate: 11.10%
Articles published: 78
Review time: 3-6 weeks
About the journal: The Journal of Statistical Planning and Inference offers itself as a multifaceted and all-inclusive bridge between classical aspects of statistics and probability, and the emerging interdisciplinary aspects that have a potential of revolutionizing the subject. While we maintain our traditional strength in statistical inference, design, classical probability, and large sample methods, we also have a far more inclusive and broadened scope to keep up with the new problems that confront us as statisticians, mathematicians, and scientists. We publish high quality articles in all branches of statistics, probability, discrete mathematics, machine learning, and bioinformatics. We also especially welcome well written and up to date review articles on fundamental themes of statistics, probability, machine learning, and general biostatistics. Thoughtful letters to the editors, interesting problems in need of a solution, and short notes carrying an element of elegance or beauty are equally welcome.