Pub Date: 2025-10-01 | Epub Date: 2025-08-18 | DOI: 10.1007/s10985-025-09667-w
Wei-En Lu, Ai Ni
In large observational studies with survival outcomes and low event rates, the case-cohort design is commonly used to reduce the cost associated with covariate measurement. The restricted mean survival time (RMST) difference has been increasingly used as an alternative to the hazard ratio when estimating causal effects on survival outcomes. We investigate the estimation of the marginal causal effect on RMST under the stratified case-cohort design while adjusting for measured confounders through propensity score stratification. The asymptotic normality of the estimator is established, and its variance formula is derived. Simulation studies are performed to evaluate the finite-sample performance of the proposed method against several alternative methods. Finally, we apply the proposed method to the Atherosclerosis Risk in Communities study to estimate the marginal causal effect of high-sensitivity C-reactive protein level on coronary heart disease-free survival.
Causal effect estimation on restricted mean survival time under case-cohort design via propensity score stratification. Lifetime Data Analysis, pp. 898-931. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12586416/pdf/
Pub Date: 2025-10-01 | Epub Date: 2025-08-27 | DOI: 10.1007/s10985-025-09666-x
Yuchen Mao, Lianming Wang, Xuemei Sui
Joint modeling of longitudinal responses and survival times has gained great attention in the statistics literature over the last few decades. Most existing works focus on the joint analysis of longitudinal data and right-censored data. In this article, we propose a new frailty model for the joint analysis of a longitudinal response and an interval-censored survival time. Such data commonly arise in real-life studies where participants are examined at periodic or irregular follow-up times. The proposed joint model contains a nonlinear mixed-effects submodel for the longitudinal response and a semiparametric probit submodel for the survival time, linked through a shared normal frailty. It allows the regression coefficients to be interpreted as marginal effects, up to a multiplicative constant, on both the longitudinal and survival responses. Adopting splines allows us to approximate the unknown baseline functions in both submodels with a finite number of unknown coefficients while providing great modeling flexibility. An efficient Gibbs sampler is developed for posterior computation, in which all parameters and latent variables can be sampled easily from their full conditional distributions. The proposed method shows good estimation performance in simulation studies and is further illustrated by a real-life application to patient data from the Aerobics Center Longitudinal Study. The R code for the proposed methodology is made available for public use.
Bayesian joint analysis of longitudinal data and interval-censored failure time data. Lifetime Data Analysis, pp. 950-969.
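The examination scheme described in this abstract, where the event time is only known to fall between two visits, can be sketched in a few lines. The exponential latent event times and equally spaced visits below are illustrative assumptions for demonstration, not the paper's model.

```python
# Hedged sketch: how interval-censored survival data arise from periodic
# examinations. Each subject is seen at visits 1, 2, ..., k; the event time is
# only known to lie in (L, R], where L is the last visit before the event and
# R the first visit after it (R = inf means right-censored at the last visit).
import math
import random

def interval_censor(event_time, visits):
    left = 0.0
    for v in visits:
        if event_time <= v:
            return (left, v)        # bracketed: (last negative visit, first positive]
        left = v
    return (left, math.inf)         # event not yet observed by the final visit

def simulate(n=5, rate=0.3, n_visits=5, seed=1):
    """Generate n interval-censored observations from an assumed exponential model."""
    rng = random.Random(seed)
    return [interval_censor(rng.expovariate(rate), range(1, n_visits + 1))
            for _ in range(n)]
```

A joint model like the one proposed would combine such intervals with repeated longitudinal measurements taken at the same visits.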
Pub Date: 2025-07-01 | Epub Date: 2025-06-14 | DOI: 10.1007/s10985-025-09658-x
Chi Wing Chu, Hok Kan Ling
We study shape-constrained nonparametric estimation of the underlying survival function in a cross-sectional study without follow-up. Assuming the rate of the initiation event is stationary over time, the observed current duration becomes a length-biased and multiplicatively censored counterpart of the underlying failure time of interest. We focus on two shape constraints for the underlying survival function, namely log-concavity and convexity. The log-concavity constraint is versatile, as it allows for log-concave densities, bi-log-concave distributions, increasing densities, and multi-modal densities. We establish the consistency and pointwise asymptotic distribution of the shape-constrained estimators. In particular, the proposed estimator under log-concavity is consistent and tuning-parameter-free, thus circumventing the well-known inconsistency of the Grenander estimator at 0, where correction methods typically involve tuning parameters.
Shape-constrained estimation for current duration data in cross-sectional studies. Lifetime Data Analysis, pp. 595-630.
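The length-bias-plus-multiplicative-censoring mechanism in this abstract has a standard constructive form: under a stationary initiation rate, the observed current duration equals U times a length-biased draw of the underlying duration, with U uniform on (0, 1). The sketch below checks this numerically for an exponential duration, an assumption made here only because its length-biased version has a simple closed form (Gamma with shape 2) and the resulting current duration is again exponential by memorylessness.

```python
# Hedged sketch of the current-duration sampling scheme: observed duration
# A = U * X*, where X* is a length-biased draw of the duration X and
# U ~ Uniform(0, 1). Exponential X is an illustrative assumption: then
# X* ~ Gamma(shape=2, rate), and A is exponential with the same rate.
import random

def current_durations(rate=1.0, n=20000, seed=7):
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        x_lb = rng.expovariate(rate) + rng.expovariate(rate)  # Gamma(2, rate) draw
        out.append(rng.random() * x_lb)                       # uniform position in spell
    return out

a = current_durations()
print(sum(a) / len(a))  # should be close to 1/rate = 1.0
```

The survey analogue is time-since-initiation reported at interview, e.g. current duration of time trying to conceive.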
Pub Date: 2025-07-01 | Epub Date: 2025-07-02 | DOI: 10.1007/s10985-025-09662-1
Dongfeng Wu
A probability method was developed to estimate the remaining-lifetime cancer risk of an asymptomatic individual, based on the individual's current age and screening history, using a progressive disease model. The risk is a function of the transition probability density from the disease-free to the preclinical state, the sojourn time in the preclinical state, and, for those with a screening history of negative results, the screening sensitivity. The method can be applied to any chronic disease. As an example, it was applied to estimate women's breast cancer risk using parameters estimated from the Health Insurance Plan of Greater New York under two scenarios, with and without a screening history, yielding meaningful results.
Estimating the risk of cancer with and without a screening history. Lifetime Data Analysis, pp. 702-712.
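The kind of risk integral the abstract describes can be sketched numerically: the probability that a person disease-free at age t0 is clinically diagnosed by age T is the chance of entering the preclinical state at some age x in (t0, T) and completing the preclinical sojourn before T. The transition density `w` and sojourn survivor `Q` below are purely illustrative assumptions, not the paper's fitted HIP parameters, and the no-screening-history case is shown (a screening history would add sensitivity terms).

```python
# Hedged numerical sketch of a progressive-disease risk integral:
#   risk = integral_{t0}^{T} w(x) * (1 - Q(T - x)) dx
# where w is an assumed transition density into the preclinical state and
# Q is an assumed sojourn-time survivor function. Illustrative values only.
import math

def risk(t0, T, w, Q, steps=2000):
    """Trapezoid rule for the risk integral above."""
    h = (T - t0) / steps
    f = lambda x: w(x) * (1 - Q(T - x))
    total = 0.5 * (f(t0) + f(T))
    for i in range(1, steps):
        total += f(t0 + i * h)
    return total * h

w = lambda x: 0.002 * math.exp(-0.002 * x)   # assumed onset density (per year)
Q = lambda s: math.exp(-s / 2.0)             # assumed exponential sojourn, mean 2 years
print(round(risk(50.0, 80.0, w, Q), 4))
```

With these toy inputs the 30-year risk from age 50 comes out at roughly 5%; the point of the sketch is only the structure of the integral, not the numbers.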
Pub Date: 2025-07-01 | Epub Date: 2025-04-29 | DOI: 10.1007/s10985-025-09654-1
Xin Ye, Shu Yang, Xiaofei Wang, Yanyan Liu
In this study, we focus on estimating the heterogeneous treatment effect (HTE) for a survival outcome. The outcome is subject to censoring, and the number of covariates is high-dimensional. We utilize data from both a randomized controlled trial (RCT), considered the gold standard, and real-world data (RWD), which may be affected by hidden confounding. To achieve a more efficient HTE estimate, such integrative analysis requires great insight into the data generation mechanism, particularly an accurate characterization of the unmeasured confounding effects/bias. With this aim, we propose a penalized-regression-based integrative approach that allows for the simultaneous estimation of parameters, selection of variables, and detection of unmeasured confounding effects. The consistency, asymptotic normality, and efficiency gains of the proposed estimator are rigorously established. Finally, we apply the proposed method to estimate the HTE of lobar versus sublobar resection on the survival of lung cancer patients. The RCT is a multicenter non-inferiority randomized phase 3 trial, and the RWD come from a clinical oncology cancer registry in the United States. The analysis reveals that unmeasured confounding exists and that the integrative approach does enhance the efficiency of the HTE estimation.
Integrative analysis of high-dimensional RCT and RWD subject to censoring and hidden confounding. Lifetime Data Analysis, pp. 473-497. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12317910/pdf/
In survival analysis, researchers commonly face variable selection issues in real-world data, particularly when complex network structures exist among covariates. Additionally, due to factors such as data collection costs and delayed entry, real-world data often exhibit censoring and truncation. This paper addresses left-truncated current status data by employing a copula-based approach to model the relationship between the censoring time and the failure time. Building on this, we investigate variable selection in the presence of complex network structures among covariates. To this end, we integrate a Markov Random Field (MRF) with the Proportional Hazards (PH) model, extending the latter to characterize the correlation structure among covariates more flexibly. To fit the constructed model, we propose a penalized optimization method and use spline functions to estimate the baseline hazard function. Through numerical simulation experiments and case studies of clinical trial data, we comprehensively evaluate the effectiveness and performance of the proposed model and its parameter inference strategy. This evaluation demonstrates the robustness of the proposed model in handling complex disease data and further verifies the precision and reliability of the parameter estimation method.
Regression analysis of a graphical proportional hazards model for informatively left-truncated current status data. Mengyue Zhang, Shishu Zhao, Shuying Wang, Xiaolin Xu. DOI: 10.1007/s10985-025-09655-0. Lifetime Data Analysis, pp. 498-542.
Pub Date: 2025-07-01 | Epub Date: 2025-05-23 | DOI: 10.1007/s10985-025-09656-z
Yufeng Xia, Yangkuo Li, Xiaobing Zhao, Xuan Xu
To investigate pairwise interactions arising from recurrent event processes in a longitudinal network, we follow the framework of the stochastic block model, where every node belongs to a latent group and interactions between node pairs from two specified groups follow a conditional nonhomogeneous Poisson process. Our focus lies on discrete observation times, which are common in practice for cost-saving reasons. The variational EM algorithm and variational maximum likelihood estimation are applied for statistical inference. When estimating the intensity functions of edges, we use a method based on the defined distribution function F and a self-consistency algorithm for recurrent events. Numerical simulations illustrate the performance of the proposed estimation procedure in uncovering the underlying structure of longitudinal networks with recurrent event processes. A dataset of interactions among French schoolchildren, collected for influenza monitoring, is analyzed.
Investigating network structures in recurrent event data with discrete observation times. Lifetime Data Analysis, pp. 543-573.
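The generative side of this setup, recurrent interactions on an edge driven by a nonhomogeneous Poisson process whose rate depends on the latent group pair, can be sketched with Lewis-Shedler thinning. The periodic intensity below is an illustrative choice for one (group, group) pair, not the paper's estimated model.

```python
# Hedged sketch: simulating recurrent interaction times on one network edge
# from a nonhomogeneous Poisson process via Lewis-Shedler thinning. The
# sinusoidal group-pair intensity is an illustrative assumption.
import math
import random

def thinning(intensity, lam_max, t_end, rng):
    """Sample event times on [0, t_end] for a rate function bounded by lam_max."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(lam_max)              # candidate from homogeneous process
        if t > t_end:
            return events
        if rng.random() < intensity(t) / lam_max:  # accept w.p. intensity/lam_max
            events.append(t)

rng = random.Random(3)
# assumed intensity for a (group 1, group 2) pair: daily cycle over a base rate
lam = lambda t: 2.0 + 1.5 * math.sin(2 * math.pi * t)
times = thinning(lam, 3.5, 10.0, rng)
print(len(times), "interactions over [0, 10]")
```

In the discrete-observation setting of the paper, only the counts of such events between observation times would be available, not the exact times simulated here.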
Pub Date: 2025-07-01 | Epub Date: 2025-06-25 | DOI: 10.1007/s10985-025-09660-3
Seoyoon Cho, Matthew A Psioda, Joseph G Ibrahim
We propose a joint model for multiple time-to-event outcomes that have a cure structure. When a subset of a population is not susceptible to an event of interest, traditional survival models cannot accommodate this phenomenon: they assume the entire population is at risk for the event, i.e., has a non-zero hazard at all times. For example, for patients with melanoma, certain modern treatment options can reduce mortality and relapse rates. Cure rate models, in contrast, allow a portion of the population to be risk-free of the event of interest. Our proposed model uses a novel truncated Gaussian copula to jointly model bivariate time-to-event outcomes of this type. In oncology studies, multiple time-to-event outcomes (e.g., overall survival and relapse-free or progression-free survival) are typically of interest, so multivariate methods for analyzing time-to-event outcomes with a cure structure are potentially of great utility. We formulate a joint model directly on the time-to-event outcomes (i.e., unconditional on whether an individual is cured). Dependency between the time-to-event outcomes is modeled via the correlation matrix of the truncated Gaussian copula. A Markov chain Monte Carlo procedure is proposed for model fitting. Simulation studies and a real data analysis using data from a melanoma clinical trial are presented to illustrate the performance of the method, and the proposed model is compared to independent models.
Bayesian bivariate cure rate models using Gaussian copulas. Lifetime Data Analysis, pp. 658-673.
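The two ingredients of this abstract, a marginal with a cure fraction and a Gaussian copula for dependence, can be combined in a short simulation sketch. The mixture-cure marginal S(t) = pi + (1 - pi) exp(-lam t) and all parameter values below are illustrative assumptions; the paper's model uses a truncated Gaussian copula, which this plain-copula sketch does not reproduce.

```python
# Hedged sketch: bivariate event times with a cure fraction, coupled through a
# (plain, untruncated) Gaussian copula. Cured subjects get an infinite time.
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf

def cure_quantile(u, pi, lam):
    """Invert the mixture-cure CDF: u < pi means 'cured' (never experiences the event)."""
    return math.inf if u < pi else -math.log((u - pi) / (1 - pi)) / lam

def draw_pair(rho, pi, lam, rng):
    # correlated standard normals -> uniforms -> mixture-cure quantiles
    z1 = rng.gauss(0, 1)
    z2 = rho * z1 + math.sqrt(1 - rho * rho) * rng.gauss(0, 1)
    return cure_quantile(PHI(z1), pi, lam), cure_quantile(PHI(z2), pi, lam)

rng = random.Random(11)
pairs = [draw_pair(0.6, 0.3, 0.5, rng) for _ in range(10000)]
cured1 = sum(math.isinf(t1) for t1, _ in pairs) / len(pairs)
print(round(cured1, 2))  # near the assumed cure fraction 0.3
```

Checking that P(T > t) = pi + (1 - pi) e^{-lam t} under this inversion is a one-line calculation, since (U - pi)/(1 - pi) is uniform given U >= pi.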
Pub Date: 2025-07-01 | Epub Date: 2025-07-28 | DOI: 10.1007/s10985-025-09659-w
Marina T Dietrich, Dennis Dobler, Mathisca C M de Gunst
The wild bootstrap is a popular resampling method in time-to-event data analysis. Previous works established its large sample properties for a variety of estimators and test statistics. It can be used to justify the accuracy of inference procedures such as hypothesis tests or time-simultaneous confidence bands. This paper provides a general framework for establishing large sample properties in a unified way using martingale structures. The framework covers most of the well-known parametric, semiparametric, and nonparametric statistical methods in time-to-event analysis. Along the way of proving the validity of the wild bootstrap, a new variant of Rebolledo's martingale central limit theorem for counting process-based martingales is developed as well.
Wild bootstrap for counting process-based statistics: a martingale theory-based approach. Lifetime Data Analysis, pp. 631-657. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12317882/pdf/
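The resampling idea behind the abstract can be shown in its simplest instance: for the Nelson-Aalen cumulative hazard estimator, the wild bootstrap multiplies each increment d_i / Y_i by an iid mean-zero, variance-one weight and uses the perturbed paths to approximate the estimator's fluctuations, e.g. to calibrate a time-simultaneous band. This sketch uses standard normal multipliers and illustrates only the mechanism, not the paper's general martingale framework.

```python
# Hedged sketch: wild bootstrap for the Nelson-Aalen estimator. Perturbed
# paths sum G_i * (d_i / Y_i) with iid standard normal multipliers G_i; the
# 95th percentile of the sup statistic calibrates a simultaneous band.
import random

def nelson_aalen(times, events):
    """Cumulative-hazard increments (t, d/Y) at each distinct event time."""
    data = sorted(zip(times, events))
    n, out, i = len(data), [], 0
    at_risk = n
    while i < n:
        t, d, c = data[i][0], 0, 0
        while i < n and data[i][0] == t:   # collect ties
            d += data[i][1]
            c += 1 - data[i][1]
            i += 1
        if d:
            out.append((t, d / at_risk))
        at_risk -= d + c
    return out

def wild_sup(increments, n_boot=1000, seed=5):
    """95th percentile of sup_t |perturbed path| over the event times."""
    rng = random.Random(seed)
    sups = []
    for _ in range(n_boot):
        cum, worst = 0.0, 0.0
        for _, inc in increments:
            cum += rng.gauss(0, 1) * inc   # wild multiplier on each increment
            worst = max(worst, abs(cum))
        sups.append(worst)
    sups.sort()
    return sups[int(0.95 * n_boot)]
```

A band of half-width `wild_sup(...)` around the Nelson-Aalen path is then (approximately) simultaneous over the observed event times.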
Pub Date: 2025-07-01 | Epub Date: 2025-07-16 | DOI: 10.1007/s10985-025-09661-2
Jih-Chang Yu, Yu-Jen Cheng
In this study, we investigate estimation and variable selection for semiparametric transformation models with length-biased survival data, a special case of left truncation commonly encountered in the social sciences and cancer prevention trials. To correct for the sampling bias, conventional methods such as conditional likelihood, martingale estimating equations, and composite likelihood have been proposed. However, these methods may be less efficient because they rely on only partial information from the full likelihood. In contrast, we adopt a full-likelihood approach under the semiparametric transformation model and propose a unified and more efficient nonparametric maximum likelihood estimator (NPMLE). To perform variable selection, we incorporate an adaptive least absolute shrinkage and selection operator (ALASSO) penalty into the full likelihood. We show that when the NPMLE is used as the initial value, the resulting one-step ALASSO estimator, which offers a simplified version of the Newton-Raphson method, achieves the oracle properties. Theoretical properties of the proposed methods are established using empirical process techniques. The performance of the methods is evaluated through simulation studies and illustrated with a real data application.
Estimation and variable selection for semiparametric transformation models with length-biased survival data. Lifetime Data Analysis, pp. 674-701.
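The ALASSO penalty mentioned in this abstract has a closed form in the simplest (orthonormal-design, linear-model) case, which makes its variable-selection behavior easy to see: each initial estimate is soft-thresholded with a data-driven threshold lam / |beta_init|, so large coefficients are barely shrunk while small ones are set exactly to zero. This is a toy illustration of the penalty, not the paper's one-step NPMLE-based procedure for transformation models.

```python
# Hedged sketch: adaptive LASSO on an orthonormal design, where the penalized
# solution is coordinate-wise weighted soft-thresholding. Large initial
# coefficients get a small threshold (little shrinkage); small ones get a
# large threshold (set to zero) -- the oracle-type selection effect.

def soft_threshold(x, t):
    return (abs(x) - t) * (1 if x > 0 else -1) if abs(x) > t else 0.0

def adaptive_lasso_orthonormal(beta_init, lam):
    """Apply the weighted threshold lam / |b| to each initial coefficient."""
    return [soft_threshold(b, lam / abs(b)) if b != 0 else 0.0 for b in beta_init]

print(adaptive_lasso_orthonormal([2.0, 0.1, -1.5], lam=0.1))
```

With lam = 0.1, the coefficient 0.1 is zeroed out while 2.0 and -1.5 are shrunk only slightly, which is the qualitative behavior that makes the one-step ALASSO estimator attractive for variable selection.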