Pub Date: 2024-04-02 | DOI: 10.1007/s10985-024-09621-2
Abstract
The case-cohort design obtains complete covariate data only on cases and on a random sample (the subcohort) of the entire cohort. Subsequent publications described the use of stratification and weight calibration to increase efficiency of estimates of Cox model log-relative hazards, and there has been some work estimating pure risk. Yet there are few examples of these options in the medical literature, and we could not find programs currently online to analyze these various options. We therefore present a unified approach and R software to facilitate such analyses. We used influence functions adapted to the various design and analysis options together with variance calculations that take the two-phase sampling into account. This work clarifies when the widely used “robust” variance estimate of Barlow (Biometrics 50:1064–1072, 1994) is appropriate. The corresponding R software, CaseCohortCoxSurvival, facilitates analysis with and without stratification and/or weight calibration, for subcohort sampling with or without replacement. We also allow for phase-two data to be missing at random for stratified designs. We provide inference not only for log-relative hazards in the Cox model, but also for cumulative baseline hazards and covariate-specific pure risks. We hope these calculations and software will promote wider use of more efficient and principled design and analysis options for case-cohort studies.
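The two-phase weighting that underlies the case-cohort design can be sketched in a few lines. The following is a generic illustration of inverse-probability design weights for an unstratified, uncalibrated design with a Bernoulli-sampled subcohort — not the CaseCohortCoxSurvival implementation; the function name and interface are ours:

```python
def case_cohort_weights(is_case, in_subcohort, sampling_frac):
    """Design weights for an unstratified case-cohort sample.

    Cases are always fully observed (weight 1).  Non-cases are fully
    observed only if they fall in the subcohort, and are then up-weighted
    by the inverse of the subcohort sampling fraction; the remaining
    cohort members have unobserved phase-two covariates (weight 0).
    """
    weights = []
    for case, sub in zip(is_case, in_subcohort):
        if case:
            weights.append(1.0)
        elif sub:
            weights.append(1.0 / sampling_frac)
        else:
            weights.append(0.0)
    return weights
```

Plugging such weights into a weighted Cox partial likelihood yields the basic case-cohort analysis; stratification and weight calibration, as discussed in the abstract, refine these weights to gain efficiency.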
Title: Cox model inference for relative hazard and pure risk from stratified weight-calibrated case-cohort data (Lifetime Data Analysis)
Pub Date: 2024-04-01 | Epub Date: 2023-11-26 | DOI: 10.1007/s10985-023-09611-w
Theresa P Devasia, Alexander Tsodikov
Semiparametric transformation models for failure time data consist of a parametric regression component and an unspecified cumulative baseline hazard. The nonparametric maximum likelihood estimator (NPMLE) of the cumulative baseline hazard can be summarized in terms of weights introduced into a Breslow-type estimator (Weighted Breslow). At any given time point, the weights invoke an integral over the future of the cumulative baseline hazard, which presents theoretical and computational challenges. A simpler non-MLE Breslow-type estimator (Breslow) was derived earlier from a martingale estimating equation (MEE) setting observed and expected counts of failures equal, conditional on the past history. Despite much successful theoretical and computational development, the simpler Breslow estimator continues to be commonly used as a compromise between simplicity and perceived loss of full efficiency. In this paper we derive the relative efficiency of the Breslow estimator and consider the properties of the two estimators using simulations and real data on prostate cancer survival.
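For readers unfamiliar with the estimator, the simpler (non-weighted) Breslow-type cumulative baseline hazard can be sketched as follows. This is a minimal textbook sketch, assuming complete data and a fixed vector of risk scores exp(x'beta) — not the paper's weighted NPMLE; with all risk scores equal to one it reduces to the Nelson–Aalen estimator:

```python
def breslow_cumhaz(times, events, risk_scores=None):
    """Breslow-type estimate of the cumulative baseline hazard.

    times: observed event/censoring times; events: 1 = failure, 0 = censored;
    risk_scores: exp(x'beta) per subject (all 1.0 reduces this to the
    Nelson-Aalen estimator).  Returns (time, cumulative hazard) pairs at
    each observed failure time.
    """
    n = len(times)
    if risk_scores is None:
        risk_scores = [1.0] * n
    H = 0.0
    out = []
    for i in sorted(range(n), key=lambda i: times[i]):
        if events[i]:
            # denominator: total risk score of subjects still at risk
            denom = sum(risk_scores[j] for j in range(n) if times[j] >= times[i])
            H += 1.0 / denom
            out.append((times[i], H))
    return out
```

The weighted Breslow NPMLE discussed in the paper replaces the 1/denom increments with weights that involve an integral over the future of the cumulative hazard, which is what makes it harder to compute.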
Title: Efficiency of the Breslow estimator in semiparametric transformation models (Lifetime Data Analysis, pp. 291–309; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11237962/pdf/)
Pub Date: 2024-04-01 | Epub Date: 2024-01-18 | DOI: 10.1007/s10985-023-09614-7
Cui-Juan Kong, Han-Ying Liang
In this paper, we define estimators of distribution functions when the data are right-censored and the censoring indicators are missing at random, and establish their strong representations and asymptotic normality. In addition, based on the empirical likelihood method, we define maximum empirical likelihood estimators and smoothed log-empirical likelihood ratios for the two-sample quantile difference in the presence and absence of auxiliary information, respectively, and derive their asymptotic distributions. A simulation study and a real data analysis are conducted to investigate the finite-sample behavior of the proposed methods.
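The target estimand can be illustrated with a plug-in, complete-data version: the two-sample quantile difference computed from two Kaplan–Meier curves. This is a hedged sketch of the estimand only, assuming the censoring indicators are fully observed; it omits the paper's machinery for indicators missing at random:

```python
def km_quantile(times, events, q):
    """Kaplan-Meier q-th quantile: smallest observed time t with
    1 - S(t) >= q, or infinity if the estimate never reaches q."""
    s = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for tj in times if tj >= t)
        deaths = sum(1 for tj, ej in zip(times, events) if tj == t and ej)
        s *= 1.0 - deaths / at_risk
        if 1.0 - s >= q:
            return t
    return float("inf")

def quantile_difference(times1, events1, times2, events2, q):
    """Plug-in two-sample quantile difference from the two KM estimates."""
    return km_quantile(times1, events1, q) - km_quantile(times2, events2, q)
```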
Title: Quantile difference estimation with censoring indicators missing at random (Lifetime Data Analysis, pp. 345–382)
Pub Date: 2024-04-01 | Epub Date: 2023-11-28 | DOI: 10.1007/s10985-023-09613-8
Chun Pan, Bo Cai, Xuemei Sui
The proportional hazards mixture cure model is a popular analysis method for survival data where a subgroup of patients are cured. When the data are interval-censored, the estimation of this model is challenging due to its complex data structure. In this article, we propose a computationally efficient semiparametric Bayesian approach, facilitated by spline approximation and Poisson data augmentation, for model estimation and inference with interval-censored data and a cure rate. The spline approximation and Poisson data augmentation greatly simplify the MCMC algorithm and enhance the convergence of the MCMC chains. The empirical properties of the proposed method are examined through extensive simulation studies and also compared with the R package "GORCure". The use of the proposed method is illustrated through analyzing a data set from the Aerobics Center Longitudinal Study.
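The defining feature of a mixture cure model is that the population survival function is a mixture of a cured fraction and the survival of the uncured, S(t) = π + (1 − π)·S_u(t), so the curve plateaus at π. A minimal sketch, assuming (for illustration only) an exponential latency — the paper's model is semiparametric and Bayesian, which this does not attempt to reproduce:

```python
import math

def mixture_cure_survival(t, cure_prob, rate):
    """Population survival in a mixture cure model:
    S(t) = pi + (1 - pi) * S_u(t),
    here with an illustrative exponential latency S_u(t) = exp(-rate * t).
    The curve plateaus at the cure fraction pi as t grows."""
    return cure_prob + (1.0 - cure_prob) * math.exp(-rate * t)
```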
Title: A Bayesian proportional hazards mixture cure model for interval-censored data (Lifetime Data Analysis, pp. 327–344)
Pub Date: 2024-04-01 | Epub Date: 2023-11-13 | DOI: 10.1007/s10985-023-09612-9
Jayoun Kim, Boram Jeong, Il Do Ha, Kook-Hwan Oh, Ji Yong Jung, Jong Cheol Jeong, Donghwan Lee
In a semi-competing risks model in which a terminal event censors a non-terminal event but not vice versa, the conventional method predicts clinical outcomes by maximum likelihood estimation. However, this method can produce unreliable or biased estimators when the number of events in the dataset is small: parameter estimates may converge to infinity, or their standard errors can be very large. Moreover, terminal and non-terminal event times may be correlated, which can be accounted for by a frailty term. Here, we adapt the penalized likelihood with Firth's correction method for gamma frailty models with semi-competing risks data to reduce the bias caused by rare events. The proposed method is evaluated in terms of relative bias, mean squared error, standard error, and standard deviation compared to the conventional methods through simulation studies. The results of the proposed method are stable and robust even when data contain only a few events and the baseline hazard function is misspecified. We also illustrate a real example with a multi-centre, patient-based cohort study to identify risk factors for chronic kidney disease progression or adverse clinical outcomes.
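The core idea of Firth's correction — that a Jeffreys-prior penalty keeps rare-event estimates finite where the MLE degenerates — has a well-known one-parameter illustration. The following shows the binomial case only, not the paper's gamma frailty model:

```python
def firth_binomial(successes, n):
    """Firth-type (Jeffreys-prior) estimate of a binomial proportion:
    (y + 1/2) / (n + 1).  Unlike the MLE y/n, it stays strictly inside
    (0, 1) even when no events (or only events) are observed, which is
    the essence of bias reduction for rare events."""
    return (successes + 0.5) / (n + 1)
```

With zero observed events in ten trials, the MLE is exactly 0 (and the log-odds estimate diverges), while the Firth estimate remains finite and strictly positive.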
Title: Bias reduction for semi-competing risks frailty model with rare events: application to a chronic kidney disease cohort study in South Korea (Lifetime Data Analysis, pp. 310–326)
Pub Date: 2024-04-01 | Epub Date: 2024-03-04 | DOI: 10.1007/s10985-024-09619-w
Motahareh Parsa, Seyed Mahmood Taghavi-Shahri, Ingrid Van Keilegom
In clinical studies, one often encounters time-to-event data that are subject to right censoring and for which a fraction of the patients under study never experience the event of interest. Such data can be modeled using cure models in survival analysis. In the presence of a cure fraction, the mixture cure model is popular, since it allows one to model the probability of being cured (called the incidence) and the survival function of the uncured individuals (called the latency). In this paper, we develop a variable selection procedure for the incidence and latency parts of a mixture cure model, consisting of a logistic model for the incidence and a semiparametric accelerated failure time model for the latency. We use a penalized likelihood approach, based on adaptive LASSO penalties for each part of the model, and we consider two algorithms for optimizing the criterion function. Extensive simulations are carried out to assess the accuracy of the proposed selection procedure. Finally, we apply the proposed method to a real dataset on heart failure patients with left ventricular systolic dysfunction.
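The building block of adaptive LASSO optimization is a soft-thresholding step with covariate-specific penalty weights. A generic sketch of that step only — the function name is ours, and this is not either of the paper's two algorithms:

```python
def adaptive_lasso_prox(z, lam, penalty_weight):
    """Soft-thresholding step of the (adaptive) LASSO.  The covariate-specific
    penalty_weight is typically 1 / |initial estimate|^gamma, so weakly
    supported coefficients are penalized more heavily and are more likely
    to be set exactly to zero (i.e., deselected)."""
    threshold = lam * penalty_weight
    if z > threshold:
        return z - threshold
    if z < -threshold:
        return z + threshold
    return 0.0
```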
Title: On variable selection in a semiparametric AFT mixture cure model (Lifetime Data Analysis, pp. 472–500)
Pub Date: 2024-04-01 | Epub Date: 2024-02-15 | DOI: 10.1007/s10985-024-09617-y
Richard A J Post, Edwin R van den Heuvel, Hein Putter
It is known that the hazard ratio lacks a useful causal interpretation. Even for data from a randomized controlled trial, the hazard ratio suffers from so-called built-in selection bias as, over time, the individuals at risk among the exposed and unexposed are no longer exchangeable. In this paper, we formalize how the expectation of the observed hazard ratio evolves and deviates from the causal effect of interest in the presence of heterogeneity of the hazard rate of unexposed individuals (frailty) and heterogeneity in effect (individual modification). For the case of effect heterogeneity, we define the causal hazard ratio. We show that the expected observed hazard ratio equals the ratio of expectations of the latent variables (frailty and modifier) conditionally on survival in the world with and without exposure, respectively. Examples with gamma, inverse Gaussian and compound Poisson distributed frailty and categorical (harming, beneficial or neutral) distributed effect modifiers are presented for illustration. This set of examples shows that an observed hazard ratio with a particular value can arise for all values of the causal hazard ratio. Therefore, the hazard ratio cannot be used as a measure of the causal effect without making untestable assumptions, stressing the importance of using more appropriate estimands, such as contrasts of the survival probabilities.
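The built-in selection bias has a closed form in the standard gamma-frailty textbook case, which makes the drift of the observed hazard ratio easy to see. A sketch under illustrative assumptions (constant baseline hazard, gamma frailty with mean 1 and variance theta) — a special case chosen for tractability, not the paper's general structural-causal-model treatment:

```python
import math

def marginal_hazard_ratio(t, beta, theta, lam0=1.0):
    """Observed (marginal) hazard ratio at time t when the conditional model is
    lambda(t | Z, A) = Z * lam0 * exp(beta * A), with gamma frailty Z
    (mean 1, variance theta).  Marginalizing over the survivors gives
    lambda_m(t | A=a) = lam0*exp(beta*a) / (1 + theta*lam0*exp(beta*a)*t),
    so the observed ratio starts at exp(beta) and drifts toward 1 as the
    high-frailty individuals are selected out of the exposed arm faster."""
    num = lam0 * math.exp(beta) / (1.0 + theta * lam0 * math.exp(beta) * t)
    den = lam0 / (1.0 + theta * lam0 * t)
    return num / den
```

Even though the conditional (causal) hazard ratio is constant at exp(beta), the observed ratio attenuates over time — the selection effect the abstract formalizes.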
Title: The built-in selection bias of hazard ratios formalized using structural causal models (Lifetime Data Analysis, pp. 404–438; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11300553/pdf/)
Pub Date: 2024-03-13 | DOI: 10.1007/s10985-024-09620-3
Zhongqi Liang, Caiya Zhang, Linjun Xu
This paper studies model averaging estimation for linear regression models when the responses are right censored and the covariates are measured with error. A novel weighted Mallows-type criterion is proposed by introducing multiple candidate models. The weight vector for model averaging is selected by minimizing the proposed criterion. Under some regularity conditions, the asymptotic optimality of the selected weight vector is established, in the sense that it asymptotically achieves the lowest possible squared loss. Simulation results show that the proposed method outperforms existing related methods. A real data example is provided to illustrate the method's practical performance.
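The general mechanics of Mallows-type weight choice can be sketched for two candidate models: pick the weight minimizing squared error plus a complexity penalty. This is a generic, uncensored illustration of the idea — the paper's criterion additionally accounts for censoring and measurement error, which this omits:

```python
def mallows_weight(y, pred1, pred2, k1, k2, sigma2, grid=1001):
    """Grid search for the averaging weight w in [0, 1] minimizing a
    Mallows-type criterion
        C(w) = ||y - w*pred1 - (1-w)*pred2||^2 + 2*sigma2*(w*k1 + (1-w)*k2),
    where k1, k2 are the candidate model sizes and sigma2 is an estimate
    of the error variance."""
    def criterion(w):
        sse = sum((yi - w * p1 - (1.0 - w) * p2) ** 2
                  for yi, p1, p2 in zip(y, pred1, pred2))
        return sse + 2.0 * sigma2 * (w * k1 + (1.0 - w) * k2)
    return min((i / (grid - 1) for i in range(grid)), key=criterion)
```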
Title: Model averaging for right censored data with measurement error (Lifetime Data Analysis)
Pub Date: 2024-03-11 | DOI: 10.1007/s10985-024-09616-z
Richard A. J. Post, Edwin R. van den Heuvel, Hein Putter
Hazard ratios are prone to selection bias, compromising their use as causal estimands. On the other hand, if Aalen’s additive hazard model applies, the hazard difference has been shown to remain unaffected by the selection of frailty factors over time. Then, in the absence of confounding, observed hazard differences are equal in expectation to the causal hazard differences. However, in the presence of effect (on the hazard) heterogeneity, the observed hazard difference is also affected by selection of survivors. In this work, we formalize how the observed hazard difference (from a randomized controlled trial) evolves by selecting favourable levels of effect modifiers in the exposed group and thus deviates from the causal effect of interest. Such selection may result in a non-linear integrated hazard difference curve even when the individual causal effects are time-invariant. Therefore, a homogeneous time-varying causal additive effect on the hazard cannot be distinguished from a time-invariant but heterogeneous causal effect. We illustrate this causal issue by studying the effect of chemotherapy on the survival time of patients suffering from carcinoma of the oropharynx using data from a clinical trial. The hazard difference can thus not be used as an appropriate measure of the causal effect without making untestable assumptions.
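The survivor-selection mechanism described here can be made concrete with a two-subgroup toy example: each subgroup has a time-invariant additive effect, yet the observed hazard difference drifts over time. A sketch under illustrative assumptions (constant hazards, two discrete effect-modifier levels) — chosen for transparency, not the paper's general formalization:

```python
import math

def observed_hazard_difference(t, lam0, effects, probs):
    """Observed additive hazard difference at time t when the individual
    causal effect on the hazard is time-invariant but heterogeneous.

    Every subject has unexposed hazard lam0; subgroup i (prevalence probs[i])
    has exposed hazard lam0 + effects[i].  Survivors in the exposed arm are
    increasingly drawn from the favourable subgroups, so the observed
    difference drifts from the mean effect toward the smallest effect."""
    weights = [p * math.exp(-(lam0 + b) * t) for p, b in zip(probs, effects)]
    hazard_exposed = sum(w * (lam0 + b) for w, b in zip(weights, effects))
    hazard_exposed /= sum(weights)
    return hazard_exposed - lam0
```

At t = 0 the observed difference equals the average causal effect; at late times it approaches the smallest subgroup effect, producing exactly the non-linear integrated hazard difference curve the abstract warns about.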
Title: Bias of the additive hazard model in the presence of causal effect heterogeneity (Lifetime Data Analysis)
Pub Date: 2024-02-25 | DOI: 10.1007/s10985-024-09618-x
Alina Schenk, Moritz Berger, Matthias Schmid
This paper presents a semi-parametric modeling technique for estimating the survival function from a set of right-censored time-to-event data. Our method, named pseudo-value regression trees (PRT), is based on the pseudo-value regression framework, modeling individual-specific survival probabilities by computing pseudo-values and relating them to a set of covariates. The standard approach to pseudo-value regression is to fit a main-effects model using generalized estimating equations (GEE). PRT extend this approach by building a multivariate regression tree with pseudo-value outcome and by successively fitting a set of regularized additive models to the data in the nodes of the tree. Due to the combination of tree learning and additive modeling, PRT are able to perform variable selection and to identify relevant interactions between the covariates, thereby addressing several limitations of the standard GEE approach. In addition, PRT include time-dependent effects in the node-wise models. Interpretability of the PRT fits is ensured by controlling the tree depth. Based on the results of two simulation studies, we investigate the properties of the PRT method and compare it to several alternative modeling techniques.
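The pseudo-values at the heart of this framework are standard jackknife quantities: for the survival probability at time t, the i-th pseudo-value is n·Ŝ(t) − (n−1)·Ŝ₋ᵢ(t), computed from leave-one-out Kaplan–Meier fits. A minimal complete-data sketch of that computation (not the PRT tree-fitting itself); with no censoring, the pseudo-value reduces to the indicator 1{Tᵢ > t}:

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t)."""
    s = 1.0
    for ti in sorted(set(times)):
        if ti > t:
            break
        at_risk = sum(1 for tj in times if tj >= ti)
        deaths = sum(1 for tj, ej in zip(times, events) if tj == ti and ej)
        s *= 1.0 - deaths / at_risk
    return s

def pseudo_values(times, events, t):
    """Jackknife pseudo-values for the survival probability at time t:
    theta_i = n * S_hat(t) - (n - 1) * S_hat_(-i)(t)."""
    n = len(times)
    full = km_survival(times, events, t)
    return [n * full - (n - 1) * km_survival(times[:i] + times[i + 1:],
                                             events[:i] + events[i + 1:], t)
            for i in range(n)]
```

These per-subject pseudo-values then serve as an (approximately) uncensored outcome that a regression model — here, a tree of regularized additive models — can be fitted to.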
Title: Pseudo-value regression trees (Lifetime Data Analysis)