Pub Date : 2024-07-01Epub Date: 2024-05-28DOI: 10.1007/s10985-024-09627-w
Anindya Bhadra, Rubin Wei, Ruth Keogh, Victor Kipnis, Douglas Midthune, Dennis W Buckman, Ya Su, Ananya Roy Chowdhury, Raymond J Carroll
We consider measurement error models for two variables observed repeatedly and subject to measurement error. One variable is continuous, while the other variable is a mixture of continuous and zero measurements. This second variable has two sources of zeros. The first source is episodic zeros, wherein some of the measurements for an individual may be zero and others positive. The second source is hard zeros, i.e., some individuals will always report zero. An example is the consumption of alcohol from alcoholic beverages: some individuals consume alcoholic beverages episodically, while others never consume alcoholic beverages. However, with a small number of repeat measurements from individuals, it is not possible to determine those who are episodic zeros and those who are hard zeros. We develop a new measurement error model for this problem, and use Bayesian methods to fit it. Simulations and data analyses are used to illustrate our methods. Extensions to parametric models and survival analysis are discussed briefly.
Title: Measurement error models with zero inflation and multiple sources of zeros, with applications to hard zeros (Lifetime Data Analysis, pp. 600–623)
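The identifiability issue the abstract describes, that a handful of repeat measurements cannot separate episodic zeros from hard zeros, is easy to see in a small simulation. The distributions and parameter values below are hypothetical choices for illustration, not the authors' model:

```python
import numpy as np

rng = np.random.default_rng(0)

n, n_repeats = 10_000, 2   # individuals, repeat measurements per individual
p_hard = 0.3               # latent probability of being a hard (never) consumer
p_consume = 0.4            # per-occasion consumption probability for episodic consumers

hard = rng.random(n) < p_hard
# Episodic consumers report a positive amount on each occasion with prob p_consume;
# hard zeros always report 0.
consumed = (~hard[:, None]) & (rng.random((n, n_repeats)) < p_consume)
amounts = np.where(consumed, rng.lognormal(2.0, 0.5, (n, n_repeats)), 0.0)

all_zero = (amounts == 0).all(axis=1)
# Among individuals whose repeats are all zero, a large share are episodic, not hard:
frac_episodic_among_all_zero = (~hard & all_zero).mean() / all_zero.mean()
print(f"{frac_episodic_among_all_zero:.2f} of all-zero reporters are episodic consumers")
```

With only two repeats, an episodic consumer reports all zeros with probability 0.6² = 0.36 here, so zero-only records are a mixture of the two latent types, which is why the authors model the hard-zero class probabilistically rather than classifying individuals.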
Pub Date : 2024-04-20DOI: 10.1007/s10985-024-09625-y
Mingyue Du, Xiyuan Gao, Ling Chen
Doubly censored failure time data arise in many areas; in this setting, the failure time of interest usually represents the elapsed time between two related events, such as an infection and the resulting disease onset. Although many methods have been proposed for regression analysis of such data, most condition on the occurrence time of the initial event and ignore both the relationship between the two events and the ancillary information contained in the initial event. To address this, a new sieve maximum likelihood approach is proposed that makes use of the ancillary information; in this method, a logistic model and a Cox proportional hazards model are employed for the initial event and the failure time of interest, respectively. A simulation study suggests that the proposed method works well in practice and, as expected, is more efficient than existing methods. The approach is applied to the AIDS study that motivated this investigation.
Title: Regression analysis of doubly censored failure time data with ancillary information (Lifetime Data Analysis)
Pub Date : 2024-04-16DOI: 10.1007/s10985-024-09624-z
Myeonggyun Lee, Andrea B. Troxel, Mengling Liu
In studies with time-to-event outcomes, multiple, inter-correlated, and time-varying covariates are commonly observed. It is of great interest to model their joint effects by allowing a flexible functional form and to delineate their relative contributions to survival risk. A class of semiparametric transformation (ST) models offers flexible specifications of the intensity function and can be a general framework to accommodate nonlinear covariate effects. In this paper, we propose a partial-linear single-index (PLSI) transformation model that reduces the dimensionality of multiple covariates into a single index and provides interpretable estimates of the covariate effects. We develop an iterative algorithm using the regression spline technique to model the nonparametric single-index function for possibly nonlinear joint effects, followed by nonparametric maximum likelihood estimation. We also propose a nonparametric testing procedure to formally examine the linearity of covariate effects. We conduct Monte Carlo simulation studies to compare the PLSI transformation model with the standard ST model and apply it to NYU Langone Health de-identified electronic health record data on COVID-19 hospitalized patients’ mortality and a Veterans’ Administration lung cancer trial.
Title: Partial-linear single-index transformation models with censored data (Lifetime Data Analysis)
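The single-index idea can be sketched in an uncensored linear-regression toy problem rather than the paper's censored transformation model: the coefficient vector is profiled out by minimizing the residual sum of squares of a flexible fit to the index. A cubic polynomial stands in for the regression-spline step, and all names and values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 2))
beta_true = np.array([0.8, 0.6])             # unit-norm index coefficients
y = np.sin(X @ beta_true) + 0.1 * rng.normal(size=n)

def profile_rss(theta):
    """RSS after fitting a flexible (here cubic) curve to y vs. the index."""
    beta = np.array([np.cos(theta), np.sin(theta)])
    u = X @ beta
    resid = y - np.polyval(np.polyfit(u, y, deg=3), u)
    return np.sum(resid ** 2)

# The index direction is identifiable only up to sign, so a half circle suffices.
thetas = np.linspace(0, np.pi, 721)
theta_hat = thetas[np.argmin([profile_rss(t) for t in thetas])]
beta_hat = np.array([np.cos(theta_hat), np.sin(theta_hat)])
print(beta_hat)   # close to (0.8, 0.6)
```

The paper's actual algorithm alternates spline estimation of the link with updates of the index coefficients inside a nonparametric likelihood; the grid search here is just the cheapest way to show the profiling step in two dimensions.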
Pub Date : 2024-04-02DOI: 10.1007/s10985-024-09621-2
The case-cohort design obtains complete covariate data only on cases and on a random sample (the subcohort) of the entire cohort. Subsequent publications described the use of stratification and weight calibration to increase efficiency of estimates of Cox model log-relative hazards, and there has been some work estimating pure risk. Yet there are few examples of these options in the medical literature, and we could not find programs currently online to analyze these various options. We therefore present a unified approach and R software to facilitate such analyses. We used influence functions adapted to the various design and analysis options together with variance calculations that take the two-phase sampling into account. This work clarifies when the widely used “robust” variance estimate of Barlow (Biometrics 50:1064–1072, 1994) is appropriate. The corresponding R software, CaseCohortCoxSurvival, facilitates analysis with and without stratification and/or weight calibration, for subcohort sampling with or without replacement. We also allow for phase-two data to be missing at random for stratified designs. We provide inference not only for log-relative hazards in the Cox model, but also for cumulative baseline hazards and covariate-specific pure risks. We hope these calculations and software will promote wider use of more efficient and principled design and analysis options for case-cohort studies.
Title: Cox model inference for relative hazard and pure risk from stratified weight-calibrated case-cohort data (Lifetime Data Analysis)
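The design weights underlying a case-cohort analysis can be sketched as Horvitz-Thompson weights: cases are observed with certainty, and non-case subcohort members stand in for all non-cases. This is a generic illustration of the unstratified design, not the CaseCohortCoxSurvival package's calibrated weights, and all parameter values are made up:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 50_000                          # full cohort size
case = rng.random(N) < 0.02         # rare outcome: all cases get full covariate data
m = 1_000                           # subcohort size, sampled without replacement
subcohort = np.zeros(N, dtype=bool)
subcohort[rng.choice(N, size=m, replace=False)] = True

phase2 = case | subcohort           # individuals with complete (phase-two) data

# Horvitz-Thompson design weights: inverse sampling probabilities.
weights = np.zeros(N)
weights[case] = 1.0                 # cases sampled with certainty
noncase_sub = subcohort & ~case
weights[noncase_sub] = (~case).sum() / noncase_sub.sum()

# Weighted phase-two counts recover the cohort size exactly (and covariate
# totals in expectation), which is what makes weighted Cox estimation work:
print(weights[phase2].sum(), N)
```

Weight calibration, as described in the abstract, would further adjust these design weights so that weighted totals of auxiliary variables known on the whole cohort are matched exactly, reducing variance.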
Pub Date : 2024-04-01Epub Date: 2023-11-26DOI: 10.1007/s10985-023-09611-w
Theresa P Devasia, Alexander Tsodikov
Semiparametric transformation models for failure time data consist of a parametric regression component and an unspecified cumulative baseline hazard. The nonparametric maximum likelihood estimator (NPMLE) of the cumulative baseline hazard can be summarized in terms of weights introduced into a Breslow-type estimator (Weighted Breslow). At any given time point, the weights invoke an integral over the future of the cumulative baseline hazard, which presents theoretical and computational challenges. A simpler non-MLE Breslow-type estimator (Breslow) was derived earlier from a martingale estimating equation (MEE) that sets observed and expected counts of failures equal, conditional on the past history. Despite much successful theoretical and computational development, the simpler Breslow estimator continues to be commonly used as a compromise between simplicity and a perceived loss of full efficiency. In this paper we derive the relative efficiency of the Breslow estimator and compare the properties of the two estimators using simulations and real data on prostate cancer survival.
Title: Efficiency of the Breslow estimator in semiparametric transformation models (Lifetime Data Analysis, pp. 291–309)
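For reference, the simple (unweighted) Breslow estimator for the Cox model, the non-MLE estimator whose efficiency the paper studies, takes only a few lines. Here the linear predictor is taken as known; in practice it would come from a fitted partial-likelihood estimate, and the simulated check below uses arbitrary parameter values:

```python
import numpy as np

def breslow_cumhaz(times, events, lp, eval_times):
    """Breslow estimator of the cumulative baseline hazard H0(t) for a Cox
    model with linear predictor lp = X @ beta (taken as known here)."""
    order = np.argsort(times)
    times, events, risk = times[order], events[order], np.exp(lp[order])
    # Risk-set sums: sum of exp(lp) over subjects still at risk at each time.
    risk_sums = np.cumsum(risk[::-1])[::-1]
    event_times = times[events == 1]
    increments = 1.0 / risk_sums[events == 1]   # dH0 at each (distinct) event time
    return np.interp(eval_times, event_times, np.cumsum(increments), left=0.0)

# Check on simulated data with H0(t) = t (unit exponential baseline hazard):
rng = np.random.default_rng(3)
n = 20_000
x = rng.normal(size=n)
beta = 0.7
t = rng.exponential(1.0 / np.exp(beta * x))    # hazard exp(beta*x), so H0(t) = t
c = rng.exponential(2.0, n)                    # independent censoring
times, events = np.minimum(t, c), (t <= c).astype(int)
est = breslow_cumhaz(times, events, beta * x, np.array([0.5, 1.0]))
print(est)   # close to [0.5, 1.0]
```

The NPMLE for general transformation models replaces the unit contribution of each at-risk subject with a weight involving the future of H0, which is exactly the complication the abstract refers to.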
Pub Date : 2024-04-01Epub Date: 2024-01-18DOI: 10.1007/s10985-023-09614-7
Cui-Juan Kong, Han-Ying Liang
In this paper, we define estimators of distribution functions when the data are right-censored and the censoring indicators are missing at random, and establish their strong representations and asymptotic normality. In addition, based on the empirical likelihood method, we define maximum empirical likelihood estimators and smoothed log-empirical likelihood ratios of the two-sample quantile difference in the presence and absence of auxiliary information, respectively, and derive their asymptotic distributions. A simulation study and a real data analysis are conducted to investigate the finite-sample behavior of the proposed methods.
Title: Quantile difference estimation with censoring indicators missing at random (Lifetime Data Analysis, pp. 345–382)
Pub Date : 2024-04-01Epub Date: 2023-11-28DOI: 10.1007/s10985-023-09613-8
Chun Pan, Bo Cai, Xuemei Sui
The proportional hazards mixture cure model is a popular method for analyzing survival data in which a subgroup of patients is cured. When the data are interval-censored, estimation of this model is challenging due to the complex data structure. In this article, we propose a computationally efficient semiparametric Bayesian approach, facilitated by spline approximation and Poisson data augmentation, for model estimation and inference with interval-censored data and a cure rate. The spline approximation and Poisson data augmentation greatly simplify the MCMC algorithm and enhance the convergence of the MCMC chains. The empirical properties of the proposed method are examined through extensive simulation studies and compared with those of the R package "GORCure". The use of the proposed method is illustrated by analyzing a data set from the Aerobics Center Longitudinal Study.
Title: A Bayesian proportional hazards mixture cure model for interval-censored data (Lifetime Data Analysis, pp. 327–344)
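Poisson data augmentation of this kind typically rests on the identity S(t) = exp(-Λ(t)) = P(N = 0) for N ~ Poisson(Λ(t)), which lets survival-likelihood factors be rewritten as Poisson terms amenable to standard augmentation tricks. A minimal numerical check of the identity with a hypothetical piecewise-constant (spline-like) hazard, not the paper's sampler:

```python
import numpy as np
from scipy.stats import poisson

def cum_hazard(t, knots, heights):
    """Integrate a step-function hazard with the given knots and heights up to t."""
    widths = np.clip(t - knots[:-1], 0.0, np.diff(knots))
    return np.sum(heights * widths)

knots = np.array([0.0, 1.0, 2.5, 4.0])     # interval endpoints (illustrative)
heights = np.array([0.2, 0.5, 0.8])        # hazard level on each interval

t = 3.0
Lam = cum_hazard(t, knots, heights)
# The survival function equals the Poisson zero-probability at rate Lam:
print(np.exp(-Lam), poisson.pmf(0, Lam))   # identical values
```

With a basis expansion of the log-hazard heights, each likelihood contribution becomes a product of Poisson probabilities over intervals, which is what makes the conditional updates in the MCMC algorithm tractable.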
Pub Date : 2024-04-01Epub Date: 2023-11-13DOI: 10.1007/s10985-023-09612-9
Jayoun Kim, Boram Jeong, Il Do Ha, Kook-Hwan Oh, Ji Yong Jung, Jong Cheol Jeong, Donghwan Lee
In a semi-competing risks model, in which a terminal event censors a non-terminal event but not vice versa, clinical outcomes are conventionally predicted by maximum likelihood estimation. However, this method can produce unreliable or biased estimators when the number of events in a dataset is small: parameter estimates may diverge to infinity, or their standard errors can be very large. Moreover, terminal and non-terminal event times may be correlated, which can be accounted for by a frailty term. Here, we adapt penalized likelihood with Firth's correction to gamma frailty models with semi-competing risks data to reduce the bias caused by rare events. The proposed method is evaluated in terms of relative bias, mean squared error, standard error, and standard deviation, compared with the conventional methods, through simulation studies. The results of the proposed method are stable and robust even when the data contain only a few events and the baseline hazard function is misspecified. We also present a real example, a multi-centre, patient-based cohort study, to identify risk factors for chronic kidney disease progression or adverse clinical outcomes.
Title: Bias reduction for semi-competing risks frailty model with rare events: application to a chronic kidney disease cohort study in South Korea (Lifetime Data Analysis, pp. 310–326)
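Firth's correction maximizes the penalized log-likelihood l(β) + ½ log|I(β)|. A sketch of the logistic-regression version (the basic building block, not the paper's gamma frailty model) shows how it yields finite estimates where ordinary maximum likelihood diverges, e.g. under the complete separation that rare events often produce. The data here are made up:

```python
import numpy as np

def firth_logistic(X, y, n_iter=50):
    """Firth's bias-reduced logistic regression: Newton iterations on the
    modified score U*(beta) = X'(y - mu + h*(0.5 - mu)), where h are the
    diagonals of the hat matrix W^{1/2} X (X'WX)^{-1} X' W^{1/2}."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = mu * (1.0 - mu)
        XtWX = X.T @ (W[:, None] * X)
        h = np.einsum("ij,ji->i", X, np.linalg.solve(XtWX, (W[:, None] * X).T))
        score = X.T @ (y - mu + h * (0.5 - mu))
        step = np.linalg.solve(XtWX, score)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

# Completely separated data: the ordinary MLE diverges, Firth's estimate is finite.
X = np.column_stack([np.ones(8), np.array([-4.0, -3, -2, -1, 1, 2, 3, 4])])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1.0])
print(firth_logistic(X, y))   # finite coefficients
```

The paper carries the same penalty idea into the gamma frailty likelihood for semi-competing risks, where the instability shows up as diverging regression coefficients when events are rare.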
Pub Date : 2024-04-01Epub Date: 2024-03-04DOI: 10.1007/s10985-024-09619-w
Motahareh Parsa, Seyed Mahmood Taghavi-Shahri, Ingrid Van Keilegom
In clinical studies, one often encounters time-to-event data that are subject to right censoring and for which a fraction of the patients under study never experience the event of interest. Such data can be modeled using cure models in survival analysis. In the presence of a cure fraction, the mixture cure model is popular, since it allows modeling of the probability of being cured (the incidence) and the survival function of the uncured individuals (the latency). In this paper, we develop a variable selection procedure for the incidence and latency parts of a mixture cure model, consisting of a logistic model for the incidence and a semiparametric accelerated failure time model for the latency. We use a penalized likelihood approach, based on adaptive LASSO penalties for each part of the model, and consider two algorithms for optimizing the criterion function. Extensive simulations are carried out to assess the accuracy of the proposed selection procedure. Finally, we apply the proposed method to a real dataset on heart failure patients with left ventricular systolic dysfunction.
Title: On variable selection in a semiparametric AFT mixture cure model (Lifetime Data Analysis, pp. 472–500)
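The adaptive LASSO step can be sketched in a plain linear-regression setting (the paper applies it to the logistic incidence and semiparametric AFT latency parts, with censoring): penalty weights 1/|initial estimate| turn the weighted lasso into an ordinary lasso on rescaled columns. All values below are illustrative:

```python
import numpy as np

def lasso_cd(X, y, alpha, n_iter=500):
    """Coordinate-descent lasso for (1/(2n))||y - Xb||^2 + alpha*||b||_1."""
    n, p = X.shape
    b, r = np.zeros(p), y.copy()
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = r + X[:, j] * b[j]                 # form the partial residual
            rho = X[:, j] @ r
            b[j] = np.sign(rho) * max(abs(rho) - n * alpha, 0.0) / col_ss[j]
            r = r - X[:, j] * b[j]                 # restore the full residual
    return b

rng = np.random.default_rng(4)
n, p = 500, 10
X = rng.normal(size=(n, p))
beta_true = np.concatenate([[2.0, -1.5], np.zeros(p - 2)])
y = X @ beta_true + rng.normal(size=n)

# Adaptive lasso: weight each penalty term by 1/|initial estimate|, which is
# equivalent to a plain lasso on columns rescaled by |b_init_j|.
b_init = np.linalg.lstsq(X, y, rcond=None)[0]
beta_hat = lasso_cd(X * np.abs(b_init), y, alpha=0.1) * np.abs(b_init)
print(np.nonzero(beta_hat)[0])   # only the truly active covariates, 0 and 1
```

Large initial estimates are penalized lightly and noise coefficients heavily, which is what gives the adaptive LASSO its selection-consistency (oracle) advantage over the plain LASSO.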
Pub Date : 2024-04-01Epub Date: 2024-02-15DOI: 10.1007/s10985-024-09617-y
Richard A J Post, Edwin R van den Heuvel, Hein Putter
It is known that the hazard ratio lacks a useful causal interpretation. Even for data from a randomized controlled trial, the hazard ratio suffers from so-called built-in selection bias as, over time, the individuals at risk among the exposed and unexposed are no longer exchangeable. In this paper, we formalize how the expectation of the observed hazard ratio evolves and deviates from the causal effect of interest in the presence of heterogeneity of the hazard rate of unexposed individuals (frailty) and heterogeneity in effect (individual modification). For the case of effect heterogeneity, we define the causal hazard ratio. We show that the expected observed hazard ratio equals the ratio of expectations of the latent variables (frailty and modifier) conditionally on survival in the world with and without exposure, respectively. Examples with gamma, inverse Gaussian and compound Poisson distributed frailty and categorical (harming, beneficial or neutral) distributed effect modifiers are presented for illustration. This set of examples shows that an observed hazard ratio with a particular value can arise for all values of the causal hazard ratio. Therefore, the hazard ratio cannot be used as a measure of the causal effect without making untestable assumptions, stressing the importance of using more appropriate estimands, such as contrasts of the survival probabilities.
Title: The built-in selection bias of hazard ratios formalized using structural causal models (Lifetime Data Analysis, pp. 404–438)
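The built-in selection bias is easy to reproduce by simulation: give every individual the same causal hazard ratio but heterogeneous (gamma frailty) baseline hazards, and the hazard ratio computed from the surviving risk sets drifts toward 1 over time because the frailest unexposed individuals are depleted first. The parameter values here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000
causal_hr = 0.5                                       # same effect for everyone
frailty = rng.gamma(shape=0.5, scale=2.0, size=n)     # mean-1 gamma frailty
arm = rng.random(n) < 0.5                             # randomized exposure

# Individual hazard: frailty * causal_hr^arm, constant in time (exponential).
rate = frailty * np.where(arm, causal_hr, 1.0)
t = rng.exponential(1.0 / rate)

def interval_hr(t0, dt=0.1):
    """Observed hazard ratio on [t0, t0+dt): events among those still at risk."""
    at_risk = t >= t0
    ev = at_risk & (t < t0 + dt)
    h1 = ev[arm].sum() / at_risk[arm].sum()
    h0 = ev[~arm].sum() / at_risk[~arm].sum()
    return h1 / h0

print([round(interval_hr(s), 2) for s in (0.0, 1.0, 2.0)])
# drifts from ~0.5 toward 1, despite a constant causal hazard ratio of 0.5
```

For this gamma frailty the drift can also be computed in closed form, since E[Z | T > t] = 1/(1 + θΛ(t)) in each arm, matching the paper's point that a given observed hazard ratio is compatible with many causal hazard ratios.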