Optimal subsampling for the Cox proportional hazards model with massive survival data
Nan Qiao, Wangcheng Li, Feng Xiao, Cunjie Lin
Journal Article, published 2023-12-19. DOI: 10.1016/j.jspi.2023.106136
https://www.sciencedirect.com/science/article/pii/S0378375823001052
Massive survival data have become common in survival analysis. In this study, a subsampling algorithm is proposed for the Cox proportional hazards model with time-dependent covariates, for settings where the sample size is extraordinarily large but computing resources are relatively limited. A subsample estimator is obtained by maximizing a weighted partial likelihood and is shown to be consistent and asymptotically normal. The optimal subsampling probabilities are derived in explicit form by minimizing the asymptotic mean squared error of the subsample estimator. Simulation studies show that the proposed method approximates the full-data estimator satisfactorily. The method is applied to corporate loan data and breast cancer data, which have different censoring rates, and the results confirm its practical advantages.
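The idea of a subsample estimator that maximizes a weighted partial likelihood can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: covariates are time-fixed (the paper treats time-dependent ones), the weights are inverse sampling probabilities applied to both the event terms and the risk-set sums, ties are handled Breslow-style, and uniform probabilities stand in for the paper's optimal probabilities, which the abstract does not state. The names `weighted_cox_fit` and `subsample_estimate` are hypothetical.

```python
import numpy as np

def weighted_cox_fit(times, events, X, weights, n_iter=50, tol=1e-8):
    """Maximize a weighted Cox partial log-likelihood with Newton's method.

    Illustrative assumptions: time-fixed covariates, Breslow-style ties.
    """
    order = np.argsort(-times)  # descending time: cumulative sums give risk-set sums
    t, d, Z, w = times[order], events[order], X[order], weights[order]
    # Index of the last row tied with row i, so each risk set includes all ties.
    last = np.searchsorted(-t, -t, side="right") - 1
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        risk = w * np.exp(Z @ beta)  # weighted relative risks
        S0 = np.cumsum(risk)[last]
        S1 = np.cumsum(risk[:, None] * Z, axis=0)[last]
        S2 = np.cumsum(
            risk[:, None, None] * Z[:, :, None] * Z[:, None, :], axis=0
        )[last]
        zbar = S1 / S0[:, None]  # weighted covariate mean over each risk set
        wd = w * d               # only observed events contribute
        score = (wd[:, None] * (Z - zbar)).sum(axis=0)
        cov = S2 / S0[:, None, None] - zbar[:, :, None] * zbar[:, None, :]
        info = (wd[:, None, None] * cov).sum(axis=0)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def subsample_estimate(times, events, X, probs, r, rng):
    """Draw r rows with replacement using `probs`, then fit with weights 1/(r * prob)."""
    idx = rng.choice(len(times), size=r, replace=True, p=probs)
    w = 1.0 / (r * probs[idx])
    return weighted_cox_fit(times[idx], events[idx], X[idx], w)

# Simulated data (hypothetical): exponential event times with hazard exp(x' beta),
# independent exponential censoring.
rng = np.random.default_rng(0)
n, r = 5000, 1000
beta_true = np.array([0.5, -0.5])
X = rng.normal(size=(n, 2))
event_time = rng.exponential(1.0 / np.exp(X @ beta_true))
censor_time = rng.exponential(2.0, size=n)
times = np.minimum(event_time, censor_time)
events = (event_time <= censor_time).astype(float)

uniform = np.full(n, 1.0 / n)  # placeholder for the paper's optimal probabilities
beta_full = weighted_cox_fit(times, events, X, np.ones(n))
beta_sub = subsample_estimate(times, events, X, uniform, r, rng)
print("full data:", beta_full)
print("subsample:", beta_sub)
```

Replacing `uniform` with any other probability vector that sums to one changes only the sampling step and the weights; the fitting routine stays the same, which is the property that lets non-uniform subsampling probabilities be plugged in.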