Yeseul Jeon, Won Chang, Seonghyun Jeong, Sanghoon Han, Jaewoo Park
Convolutional neural networks (CNNs) provide flexible function approximations for a wide variety of applications when the input variables are in the form of images or spatial data. Although CNNs often outperform traditional statistical models in prediction accuracy, statistical inference, such as estimating the effects of covariates and quantifying the prediction uncertainty, is not trivial due to the highly complicated model structure and overparameterization. To address this challenge, we propose a new Bayesian approach that embeds CNNs within the generalized linear models (GLMs) framework. We use the extracted nodes from the last hidden layer of the CNN, obtained with Monte Carlo (MC) dropout, as informative covariates in a GLM. This improves accuracy in prediction and regression coefficient inference, allowing for the interpretation of coefficients and uncertainty quantification. By fitting ensemble GLMs across multiple realizations from MC dropout, we can account for uncertainties in extracting the features. We apply our methods to biological and epidemiological problems, which have both high-dimensional correlated inputs and vector covariates. Specifically, we consider malaria incidence data, brain tumor image data, and fMRI data. By extracting information from correlated inputs, the proposed method can provide an interpretable Bayesian analysis. The algorithm is broadly applicable to image regression and correlated data analysis, enabling fast and accurate Bayesian inference.
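The ensemble-GLM idea can be illustrated with a minimal sketch. Everything below is assumed for illustration, not the authors' implementation: synthetic features stand in for CNN last-hidden-layer nodes, the GLM is a plain Gaussian one fit by least squares, and `mc_dropout_features` is a hypothetical helper mimicking one MC-dropout realization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for CNN last-hidden-layer features: in practice H
# would be extracted from a trained network with dropout kept active.
n, p = 200, 5
H = rng.normal(size=(n, p))
beta_true = np.array([1.0, -0.5, 0.0, 2.0, 0.3])
y = H @ beta_true + rng.normal(scale=0.1, size=n)

def mc_dropout_features(H, rate, rng):
    """One MC-dropout realization: Bernoulli mask with inverted scaling."""
    mask = rng.random(H.shape) > rate
    return H * mask / (1.0 - rate)

# Ensemble of Gaussian GLMs (plain least squares here), one fit per
# dropout realization; the across-fit spread of coefficients reflects
# uncertainty in the extracted features.
coefs = []
for _ in range(100):
    Hd = mc_dropout_features(H, rate=0.2, rng=rng)
    b, *_ = np.linalg.lstsq(Hd, y, rcond=None)
    coefs.append(b)
coefs = np.array(coefs)

print("ensemble coefficient means:", coefs.mean(axis=0).round(2))
print("ensemble coefficient SDs:  ", coefs.std(axis=0).round(2))
```

Averaging across realizations gives a point summary and the spread quantifies feature-extraction uncertainty; some attenuation toward zero is expected here because dropout acts as multiplicative noise on the regressors.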
A Bayesian convolutional neural network-based generalized linear model. Biometrics 80(2), published online 2024-03-27. doi:10.1093/biomtc/ujae057
To leverage the advancements in genome-wide association studies (GWAS) and quantitative trait loci (QTL) mapping for traits and molecular phenotypes to gain mechanistic understanding of genetic regulation, biological researchers often investigate the expression QTLs (eQTLs) that colocalize with QTL or GWAS peaks. Our research is inspired by 2 such studies. One aims to identify the causal single nucleotide polymorphisms that are responsible for the phenotypic variation and whose effects can be explained by their impacts at the transcriptomic level in maize. The other, a mouse study, focuses on uncovering the cis-driver genes that induce phenotypic changes by regulating trans-regulated genes. Both studies can be formulated as mediation problems with potentially high-dimensional exposures, confounders, and mediators that seek to estimate the overall indirect effect (IE) for each exposure. In this paper, we propose MedDiC, a novel procedure to estimate the overall IE based on a difference-in-coefficients approach. Our simulation studies find that MedDiC offers valid inference for the IE with higher power, shorter confidence intervals, and faster computing time than competing methods. We apply MedDiC to the 2 aforementioned motivating datasets and find that MedDiC yields reproducible outputs across the analysis of closely related traits, with results supported by external biological evidence. The code and additional information are available on our GitHub page (https://github.com/QiZhangStat/MedDiC).
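The difference-in-coefficients idea can be seen in a deliberately one-dimensional toy (MedDiC itself targets high-dimensional exposures, confounders, and mediators; this sketch only shows the estimand): the overall IE is the exposure coefficient in the outcome model without the mediator minus its coefficient once the mediator is included.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy single-exposure mediation setup (illustrative values, not the
# paper's data): x -> m -> y with a direct x -> y path as well.
n = 5000
x = rng.normal(size=n)                  # exposure (e.g., SNP dosage)
m = 0.8 * x + rng.normal(size=n)        # mediator (e.g., expression)
y = 0.5 * x + 0.6 * m + rng.normal(size=n)

def ols_slope(y, X):
    """OLS coefficients of y on the columns of X (intercept included)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0][1:]

beta_total = ols_slope(y, x[:, None])[0]                 # total effect
beta_direct = ols_slope(y, np.column_stack([x, m]))[0]   # direct effect
ie_hat = beta_total - beta_direct                        # indirect effect

print(f"IE estimate: {ie_hat:.3f} (truth 0.8 * 0.6 = 0.48)")
```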
Qi Zhang, Zhikai Yang, Jinliang Yang. Dissecting the colocalized GWAS and eQTLs with mediation analysis for high-dimensional exposures and confounders. Biometrics 80(2), published online 2024-03-27. doi:10.1093/biomtc/ujae050
Evan Kwiatkowski, Jiawen Zhu, Xiao Li, Herbert Pang, Grazyna Lieberman, Matthew A Psioda
We develop a method for hybrid analyses that uses external controls to augment internal control arms in randomized controlled trials (RCTs), where the degree of borrowing is determined by the similarity between RCT and external control patients in order to account for systematic differences (e.g., unmeasured confounders). The method is a novel extension of the power prior in which discounting weights are computed separately for each external control based on compatibility with the randomized control data. The discounting weights are determined using the predictive distribution for the external controls, derived via the posterior distribution for time-to-event parameters estimated from the RCT. The method is applied using a proportional hazards regression model with a piecewise constant baseline hazard. A simulation study and a real-data example are presented based on a completed trial in non-small cell lung cancer. It is shown that the case weighted power prior provides robust inference under various forms of incompatibility between the external controls and the RCT population.
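A stylized sketch of per-subject discounting follows. It deliberately simplifies the paper's construction: exponential survival with no censoring instead of a piecewise-constant hazard, a plug-in hazard estimate instead of a full posterior predictive, and a simple typicality score 4·S(t)·(1−S(t)) that is largest near the RCT median time and shrinks toward both tails — a stand-in for predictive-distribution-based weights, not the authors' formula.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated control times: RCT controls, externals compatible with the
# RCT, and externals with much shorter survival (systematic drift).
rct_controls = rng.exponential(scale=10.0, size=80)
compatible = rng.exponential(scale=10.0, size=30)   # like the RCT arm
drifted = rng.exponential(scale=0.5, size=30)       # incompatible
external = np.concatenate([compatible, drifted])

lam_hat = 1.0 / rct_controls.mean()        # hazard estimated from RCT
surv = np.exp(-lam_hat * external)         # S(t) per external subject
weights = 4.0 * surv * (1.0 - surv)        # in [0, 1], peaks at median

print("mean weight, compatible externals:", round(weights[:30].mean(), 2))
print("mean weight, drifted externals:  ", round(weights[30:].mean(), 2))
```

Externals that look like the RCT controls keep weights near the maximum, while those in the tails of the RCT-implied distribution are discounted toward zero — the qualitative behavior the case-weighted power prior is after.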
Case weighted power priors for hybrid control analyses with time-to-event data. Biometrics 80(2), published online 2024-03-27. doi:10.1093/biomtc/ujae019
Deep learning has attained huge success in diverse fields, while its application to survival data analysis remains limited and deserves further exploration. For the analysis of current status data, a deep partially linear Cox model is proposed to circumvent the curse of dimensionality. Modeling flexibility is attained by using deep neural networks (DNNs) to accommodate nonlinear covariate effects and monotone splines to approximate the baseline cumulative hazard function. We establish the convergence rate of the proposed maximum likelihood estimators. Moreover, we show that the finite-dimensional estimator for the treatment covariate effects is $\sqrt{n}$-consistent, asymptotically normal, and attains semiparametric efficiency. Finally, we demonstrate the performance of our procedures through extensive simulation studies and an application to real-world data on news popularity.
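The partially linear structure can be sketched in a generic form (standard notation assumed here, not copied from the paper): with low-dimensional covariates $Z$ (e.g., treatment) and high-dimensional covariates $X$, the conditional cumulative hazard is

$$\Lambda(t \mid Z, X) = \Lambda(t) \exp\{\beta^{\top} Z + g(X)\},$$

where $\beta$ is the finite-dimensional parameter of interest, $g$ is an unknown function modeled by a DNN, and the baseline cumulative hazard $\Lambda$ is approximated by monotone splines. For current status data, each subject is inspected once at time $C_i$ and only $\delta_i = 1(T_i \le C_i)$ is observed, so the log-likelihood takes the standard form

$$\ell = \sum_{i=1}^{n} \Big[ \delta_i \log\big(1 - e^{-\Lambda(C_i)\exp\{\beta^{\top} Z_i + g(X_i)\}}\big) - (1 - \delta_i)\, \Lambda(C_i) \exp\{\beta^{\top} Z_i + g(X_i)\} \Big].$$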
Qiang Wu, Xingwei Tong, Xingqiu Zhao. Deep partially linear Cox model for current status data. Biometrics 80(2), published online 2024-03-27. doi:10.1093/biomtc/ujae024
Fish growth models are crucial for fisheries stock assessments and are commonly estimated using fish length-at-age data. These data are widely collected using length-stratified age sampling (LSAS), a cost-effective two-phase response-selective sampling method. The data may contain age measurement errors (MEs). We propose a methodology that accounts for both LSAS and age MEs to accurately estimate fish growth. The proposed methods use the empirical proportion likelihood methodology for LSAS and the structural errors-in-variables methodology for age MEs. We provide a measure of uncertainty for parameter estimates and standardized residuals for model validation. To model the age distribution, we employ a continuation ratio-logit model that is consistent with the random nature of the true age distribution. We also apply a discretization approach for age and length distributions, which significantly improves computational efficiency and is consistent with the discrete age and length data typically encountered in practice. Our simulation study shows that neglecting age MEs can lead to significant bias in growth estimation, even when the age MEs are small but non-negligible. However, our new approach performs well regardless of the magnitude of the age MEs and accurately estimates the standard errors (SEs) of the parameter estimators. Real data analysis demonstrates the effectiveness of the proposed model validation device. Computer codes to implement the methodology are provided.
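A continuation ratio-logit model builds age probabilities from conditional "continuation" probabilities. The sketch below shows the generic construction (illustrative logits and a hypothetical `cr_logit_probs` helper; the paper's exact parameterization may differ): h_a is the probability of being age a given age ≥ a, and the implied class probabilities are p_a = h_a · ∏_{k<a}(1 − h_k).

```python
import numpy as np

# Illustrative logits for ages 1..4; the final age class absorbs the
# remaining probability mass.
logits = np.array([-1.0, -0.5, 0.0, 0.5])

def cr_logit_probs(logits):
    """Age probabilities implied by a continuation ratio-logit model."""
    h = 1.0 / (1.0 + np.exp(-logits))               # continuation hazards
    surv = np.concatenate([[1.0], np.cumprod(1.0 - h)])  # P(age >= a)
    return np.append(surv[:-1] * h, surv[-1])       # last class = remainder

p = cr_logit_probs(logits)
print("age probabilities:", p.round(3), "sum:", p.sum().round(6))
```

By construction the probabilities are positive and sum to one, which is what makes the parameterization convenient for modeling a discrete age distribution.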
Nan Zheng, Atefeh Kheirollahi, Yildiz Yilmaz. Addressing age measurement errors in fish growth estimation from length-stratified samples. Biometrics 80(2), published online 2024-03-27. doi:10.1093/biomtc/ujae029
The marginal structural quantile model (MSQM) provides a unique lens to understand the causal effect of a time-varying treatment on the full distribution of potential outcomes. Under the semiparametric framework, we derive the efficient influence function for the MSQM, from which a new doubly robust estimator is proposed for point estimation and inference. We show that the doubly robust estimator is consistent if either the model for treatment assignment or the model for the potential outcome distributions is correctly specified, and is semiparametric efficient if both models are correct. To implement the doubly robust MSQM estimator, we propose solving a smoothed estimating equation to facilitate efficient computation of the point and variance estimates. In addition, we develop a confounding function approach to investigate the sensitivity of several MSQM estimators when the sequential ignorability assumption is violated. Extensive simulations are conducted to examine the finite-sample performance characteristics of the proposed methods. We apply the proposed methods to the Yale New Haven Health System Electronic Health Record data to study the effect of antihypertensive medications on patients with severe hypertension and to assess the robustness of the findings to unmeasured baseline and time-varying confounding.
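The smoothing device can be shown in a toy unconditional quantile problem (the paper's MSQM estimating equation is more involved; here the logistic CDF is an assumed smoothing kernel): replacing the indicator 1{y ≤ q} with a smooth CDF makes the estimating function differentiable in q, which eases root-finding and variance estimation.

```python
import numpy as np

rng = np.random.default_rng(3)

y = rng.normal(loc=2.0, size=4000)   # toy outcomes, true median = 2
tau, h = 0.5, 0.1                    # quantile level and bandwidth

def smooth_ee(q, y, tau, h):
    """Smoothed version of mean(1{y <= q}) - tau, logistic kernel."""
    z = np.clip((q - y) / h, -50.0, 50.0)   # clip to avoid overflow
    return np.mean(1.0 / (1.0 + np.exp(-z))) - tau

# The smoothed estimating function is increasing in q, so a simple
# bisection finds its root.
lo, hi = y.min() - 1.0, y.max() + 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if smooth_ee(mid, y, tau, h) < 0.0:
        lo = mid
    else:
        hi = mid
q_hat = 0.5 * (lo + hi)
print(f"smoothed median estimate: {q_hat:.3f} "
      f"(sample median {np.median(y):.3f})")
```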
Chao Cheng, Liangyuan Hu, Fan Li. Doubly robust estimation and sensitivity analysis for marginal structural quantile models. Biometrics 80(2), published online 2024-03-27. doi:10.1093/biomtc/ujae045
Brain effective connectivity analysis quantifies the directed influence of one neural element or region over another, and it is of great scientific interest to understand how the effective connectivity pattern is affected by variations in subject conditions. Vector autoregression (VAR) is a useful tool for this type of problem. However, there is a paucity of solutions when there is measurement error, when there are multiple subjects, and when the focus is inference on the transition matrix. In this article, we study the problem of transition matrix inference under the high-dimensional VAR model with measurement error and multiple subjects. We propose a simultaneous testing procedure with three key components: a modified expectation-maximization (EM) algorithm, a test statistic based on the tensor regression of a bias-corrected estimator of the lagged auto-covariance given the covariates, and a properly thresholded simultaneous test. We establish uniform consistency for the estimators of our modified EM algorithm and show that the subsequent test achieves consistent false discovery control, with power approaching one asymptotically. We demonstrate the efficacy of our method through both simulations and a brain connectivity study of task-evoked functional magnetic resonance imaging.
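One ingredient, the bias correction of the lagged auto-covariance, can be illustrated in a toy single-subject VAR with additive measurement error and a noise variance assumed known (the paper's multisubject EM machinery is omitted): with Y_t = X_t + e_t, the naive lag-1 regression Cov1·Cov0⁻¹ is attenuated, and subtracting the noise variance from Cov0 corrects it.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate X_t = A X_{t-1} + eta_t, observed as Y_t = X_t + e_t.
p, T, sigma2 = 3, 20000, 0.25
A = np.array([[0.5, 0.2, 0.0],
              [0.0, 0.4, 0.1],
              [0.1, 0.0, 0.3]])
X = np.zeros((T, p))
for t in range(1, T):
    X[t] = A @ X[t - 1] + rng.normal(scale=0.5, size=p)
Y = X + rng.normal(scale=np.sqrt(sigma2), size=(T, p))

Cov0 = (Y[:-1].T @ Y[:-1]) / (T - 1)   # lag-0 covariance of Y
Cov1 = (Y[1:].T @ Y[:-1]) / (T - 1)    # lag-1 cross-covariance

A_naive = Cov1 @ np.linalg.inv(Cov0)                       # attenuated
A_corrected = Cov1 @ np.linalg.inv(Cov0 - sigma2 * np.eye(p))  # corrected

print("max error, naive:    ", np.abs(A_naive - A).max().round(3))
print("max error, corrected:", np.abs(A_corrected - A).max().round(3))
```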
Xiang Lyu, Jian Kang, Lexin Li. High-dimensional multisubject time series transition matrix inference with application to brain connectivity analysis. Biometrics 80(2), published online 2024-03-27. doi:10.1093/biomtc/ujae021
When studying the treatment effect on time-to-event outcomes, it is common that some individuals never experience failure events, which suggests that they have been cured. However, the cure status may not be observed due to censoring, which makes it challenging to define treatment effects. Current methods mainly focus on estimating model parameters in various cure models, ultimately leading to a lack of causal interpretation. To address this issue, we propose 2 causal estimands, the timewise risk difference and the mean survival time difference in the always-uncured, based on principal stratification, as a complement to the treatment effect on cure rates. These estimands allow us to study the treatment effects on failure times in the always-uncured subpopulation. We show that, using a substitutional variable for the potential cure status, these 2 estimands are identifiable under an ignorable treatment assignment mechanism. We also provide estimation methods using mixture cure models. We applied our approach to an observational study that compared the leukemia-free survival rates of different transplantation types to cure acute lymphoblastic leukemia. Our proposed approach yielded insightful results that can be used to inform future treatment decisions.
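A mixture cure model, the estimation vehicle mentioned above, can be sketched in its simplest form (illustrative exponential choice for the uncured survival, not tied to the paper's data): the population survival mixes a cured fraction π, which never fails, with the uncured survival S_u(t).

```python
import numpy as np

pi = 0.3     # cure probability (illustrative)
lam = 0.2    # hazard among the uncured (illustrative)

def S_pop(t):
    """Population survival under a mixture cure model:
    S_pop(t) = pi + (1 - pi) * S_u(t), with exponential S_u here."""
    return pi + (1.0 - pi) * np.exp(-lam * t)

t = np.array([0.0, 5.0, 50.0])
print("population survival:", S_pop(t).round(3))
# As t grows, S_pop(t) plateaus at the cure fraction pi rather than
# decaying to zero -- the signature of a cured subpopulation.
```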
Yi Wang, Yuhao Deng, Xiao-Hua Zhou. Causal inference for time-to-event data with a cured subpopulation. Biometrics 80(2), published online 2024-03-27. doi:10.1093/biomtc/ujae028
Hajime Uno, Lu Tian, Miki Horiguchi, Satoshi Hattori, Kenneth L Kehl
Limitations of using the traditional Cox's hazard ratio for summarizing the magnitude of the treatment effect on time-to-event outcomes have been widely discussed, and alternative measures that do not have such limitations are gaining attention. One of the alternative methods recently proposed, in a simple 2-sample comparison setting, uses the average hazard with survival weight (AH), which can be interpreted as the general censoring-free person-time incidence rate on a given time window. In this paper, we propose a new regression analysis approach for the AH with a truncation time τ. We investigate 3 versions of AH regression analysis, assuming (1) independent censoring, (2) group-specific censoring, and (3) covariate-dependent censoring. The proposed AH regression methods are closely related to robust Poisson regression. While the new approach requires an explicit truncation time τ, it can be more robust than Poisson regression in the presence of censoring. With the AH regression approach, one can summarize the between-group treatment difference in both absolute and relative terms, adjusting for covariates that are associated with the outcome. This property will increase the likelihood that the treatment effect magnitude is correctly interpreted. The AH regression approach can be a useful alternative to the traditional Cox's hazard ratio approach for estimating and reporting the magnitude of the treatment effect on time-to-event outcomes.
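The AH estimand itself has a simple empirical form. The one-arm sketch below assumes no censoring and uses the empirical survival curve (the paper's regression machinery and censoring adjustments are omitted): AH(τ) = (1 − S(τ)) / ∫₀^τ S(t) dt, that is, events per unit of person-time on [0, τ].

```python
import numpy as np

rng = np.random.default_rng(5)

# Exponential event times with constant hazard 0.1, so the true AH on
# any window equals 0.1 exactly.
times = rng.exponential(scale=10.0, size=2000)
tau = 8.0

def average_hazard(times, tau, grid_size=4001):
    """AH(tau) from the empirical survival curve, no censoring."""
    grid = np.linspace(0.0, tau, grid_size)
    S = np.array([(times > t).mean() for t in grid])  # empirical S(t)
    # trapezoidal rule for the person-time integral over [0, tau]
    person_time = 0.5 * np.sum((S[1:] + S[:-1]) * np.diff(grid))
    return (1.0 - S[-1]) / person_time

ah = average_hazard(times, tau)
print(f"AH estimate on [0, {tau}]: {ah:.3f} (true constant hazard 0.1)")
```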
Regression models for average hazard. Biometrics 80(2), published online 2024-03-27. doi:10.1093/biomtc/ujae037
Gianluca Baio. Discussion on "Bayesian meta-analysis of penetrance for cancer risk" by Thanthirige Lakshika M. Ruberu, Danielle Braun, Giovanni Parmigiani, and Swati Biswas. Biometrics 80(2), published online 2024-03-27. doi:10.1093/biomtc/ujae041