Statistics in Medicine: Latest Articles

Multiple Testing of Mix-and-Match Feature Sets in Multi-Omics.
IF 1.8; Zone 4, Medicine; Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2026-01-01; DOI: 10.1002/sim.70367
Mitra Ebrahimpoor, Renée Menezes, Ningning Xu, Jelle J Goeman

Integrated analysis of multi-omics datasets holds great promise for uncovering complex biological processes. However, the large dimensionality of omics data poses significant interpretability and multiple testing challenges. Simultaneous enrichment analysis (SEA) was introduced to address these issues in single-omics analysis, providing an in-built multiple testing correction and enabling simultaneous feature set testing. In this article, we introduce OCEAN, an extension of SEA to multi-omics data. OCEAN is a flexible approach to analyze potentially all possible two-way feature sets from any pair of genomics datasets. We also propose two new error rates which are in line with the two-way structure of the data and facilitate interpretation of the results. The power and utility of OCEAN are demonstrated by analyzing copy number and gene expression data for breast and colon cancer.

Statistics in Medicine, vol. 45, issue 1-2, e70367. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12825407/pdf/
Citations: 0
Adaptive Sparsening and Smoothing of the Treatment Model for Longitudinal Causal Inference Using Outcome-Adaptive LASSO and Marginal Fused LASSO.
IF 1.8; Zone 4, Medicine; Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2026-01-01; DOI: 10.1002/sim.70316
Mireille E Schnitzer, Denis Talbot, Yan Liu, David Berger, Guanbo Wang, Jennifer O'Loughlin, Marie-Pierre Sylvestre, Ashkan Ertefaie

Causal variable selection in time-varying treatment settings is challenging due to evolving confounding effects. Existing methods mainly focus on time-fixed exposures and are not directly applicable to time-varying scenarios. We propose a novel two-step procedure for variable selection when modeling the treatment probability at each time point. We first introduce a novel approach to longitudinal confounder selection using a Longitudinal Outcome Adaptive LASSO (LOAL) that will data-adaptively select covariates with theoretical justification of variance reduction of the estimator of the causal effect. We then propose an adaptive fused LASSO that can collapse treatment model parameters over time points with the goal of simplifying the models in order to improve the efficiency of the estimator while minimizing model misspecification bias compared with naive pooled logistic regression models. Our simulation studies highlight the need for and usefulness of the proposed approach in practice. We implemented our method on data from the Nicotine Dependence in Teens study to estimate the effect of the timing of alcohol initiation during adolescence on depressive symptoms in early adulthood.

Statistics in Medicine, vol. 45, issue 1-2, e70316. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12826353/pdf/
Citations: 0
Doubly Robust Estimation and Sensitivity Analysis With Outcomes Truncated by Death in Multi-Arm Clinical Trials.
IF 1.8; Zone 4, Medicine; Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-12-01; DOI: 10.1002/sim.70297
Jiaqi Tong, Chao Cheng, Guangyu Tong, Michael O Harhay, Fan Li

In clinical trials, the observation of participant outcomes may frequently be hindered by death, leading to ambiguity in defining a scientifically meaningful final outcome for those who die. Principal stratification methods are valuable tools for addressing the average causal effect among always-survivors, that is, the average treatment effect among a subpopulation defined as those who would survive regardless of treatment assignment. Although robust methods for the truncation-by-death problem in two-arm clinical trials have been previously studied, their expansion to multi-arm clinical trials remains elusive. In this article, we study the identification of a class of survivor average causal effect estimands with multiple treatments under monotonicity and principal ignorability, and first propose simple weighting and regression approaches for point estimation. As a further improvement, we derive the efficient influence function to motivate doubly robust estimators for the survivor average causal effects in multi-arm clinical trials. We also propose sensitivity methods under violations of key causal assumptions. Extensive simulations are conducted to investigate the finite-sample performance of the proposed methods against the existing methods, and a real data example is used to illustrate how to operationalize the proposed estimators and the sensitivity methods in practice.

Statistics in Medicine, vol. 44, issue 28-30, e70297. Journal Article.
Citations: 0
Survival Analysis Under the Aalen's Additive Hazards Model With Covariate Measurement Error: Application to Causal Mediation Analysis.
IF 1.8; Zone 4, Medicine; Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-12-01; DOI: 10.1002/sim.70346
Xialing Wen, Liangchen Qin, Hui Wu, Ying Yan

Covariate measurement error is an important problem in survival analysis, which has been well studied under the Cox proportional hazards model. However, measurement error effects have been rarely addressed under the Aalen's additive hazards model, and there is a lack of methods to correct for error effects. In recent years, the Aalen's additive hazards model has been increasingly used in causal mediation analysis. Although the longitudinal mediator is frequently measured with uncertainty, the issue of measurement error in the mediator has received little attention. In this article, we study the general problem of covariate measurement error under the Aalen's additive hazards model and propose a measurement error correction strategy. We then extend the proposed method to causal mediation analysis in the survival setting with an error-prone longitudinal mediator. Corrected estimation of the direct and indirect effects is obtained. The performance of the proposed method is assessed in numerical studies.

Statistics in Medicine, vol. 44, issue 28-30, e70346. Journal Article.
Citations: 0
Effects Among the Affected.
IF 1.8; Zone 4, Medicine; Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-12-01; DOI: 10.1002/sim.70353
Lina M Montoya, Elvin H Geng, Michael Valancius, Michael R Kosorok, Maya L Petersen

We propose a novel causal estimand that elucidates how response to an earlier treatment (e.g., treatment initiation) modifies the effect of a later treatment (e.g., treatment discontinuation), thus learning if there are effects among the (un)affected. Specifically, we consider a working marginal structural model summarizing how the average effect of a later treatment varies as a function of the (estimated) conditional average effect of an earlier treatment. We define the estimand to be a data-adaptive causal parameter, allowing for estimation of the conditional average treatment effect using machine learning without making strong smoothness assumptions. We show how a sequentially randomized design can be used to identify this causal estimand, and we describe a targeted maximum likelihood estimator for the resulting statistical estimand, with influence curve-based inference. We present simulation studies that evaluate the performance of this estimator under various finite-sample scenarios. Throughout, we use the "Adaptive Strategies for Preventing and Treating Lapses of Retention in HIV Care" trial (NCT02338739) as an illustrative example, showing that discontinuation of conditional cash transfers for HIV care adherence was most harmful among those who had an increase in benefit from them initially.

Statistics in Medicine, vol. 44, issue 28-30, e70353. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12801280/pdf/
Citations: 0
Bayesian Integrated Learning of Longitudinal Dose-Response Relationships via Decentralized Clinical Trials.
IF 1.8; Zone 4, Medicine; Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-12-01; DOI: 10.1002/sim.70338
Jingyi Zhang, Tuo Wang, Yongming Qu, Fangrong Yan, Suyu Liu, Ruitao Lin

Decentralized clinical trials (DCTs) extend trial activities beyond traditional sites, enhancing access, convenience, efficiency, and result generalizability. They are particularly promising for chronic conditions like diabetes and obesity, which require longer study durations to evaluate drug effects. However, decentralized data collection raises concerns about increased variability and potential biases. This paper presents a novel Bayesian integrated learning procedure to analyze dose-response relationships using longitudinal data from a phase II DCT that combines centralized and decentralized data collection. We generalize a parametric exponential decay model to handle mixed data sources and apply Bayesian spike-and-slab priors to address biases and uncertainties from decentralized measurements. Our model enables data-adaptive integration of information from both centralized and decentralized sources. Through simulations and sensitivity analyses, we show that the proposed approach achieves favorable performance across various scenarios. Notably, the method matches the efficiency of traditional trials when decentralized data collection introduces no additional variability or error. Even when such issues arise, it remains less biased and more efficient than naïve methods that rely solely on centralized data or simply pool data from both sources.

Statistics in Medicine, vol. 44, issue 28-30, e70338. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12675892/pdf/
Citations: 0
Cluster-Level Analyses to Estimate a Risk Difference in a Cluster Randomized Trial With Confounding Individual-Level Covariates: A Simulation Study.
IF 1.8; Zone 4, Medicine; Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-12-01; DOI: 10.1002/sim.70341
Jules Antoine Pereira Macedo, Bruno Giraudeau

Cluster randomized trials (CRTs) may be analyzed using cluster-level analyses. For binary outcomes, proportions are estimated for each cluster, and a risk difference can be estimated. The confidence interval is estimated using a Student distribution. However, in doing so, individual-level characteristics are not adjusted for even though CRTs are known to be prone to recruitment/identification bias, possibly implying individual-level confounders. With a simulation study, we compared cluster-level analyses to estimate a risk difference for a two-arm parallel CRT with individual-level confounders and cluster-level covariates. We considered the unadjusted (UN) method, two two-stage procedure (TSP) methods considering a binomial or a Gaussian distribution, G-computation (GC), and targeted maximum likelihood estimation (TMLE) methods. As expected, the UN method was biased. TSP methods were also biased for scenarios with a treatment effect when the number of clusters per arm was small. GC and TMLE methods were unbiased. For these latter methods, adjustment on only individual-level covariates led to better performance measures (type I error rate, coverage rate and relative error of the standard error) than adjustment on both individual- and cluster-level covariates. TSP, GC and TMLE had very similar results except in scenarios with a small number of clusters: Biased results for TSP methods and convergence problems for GC methods. In this case, TMLE should be preferred.
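
The unadjusted cluster-level analysis described at the start of this abstract (one proportion per cluster, a difference of arm means, a Student-based confidence interval) can be sketched as below. This is a minimal illustration, not the authors' code: the function name, the pooled-variance standard error, and the choice of degrees of freedom (total clusters minus 2) are assumptions, and the Student quantile `t_crit` is supplied by the caller rather than computed.

```python
from math import sqrt
from statistics import mean, variance


def cluster_level_risk_difference(control, treated, t_crit):
    """Unadjusted cluster-level risk difference (hypothetical helper).

    control, treated: lists of (events, cluster_size) pairs, one per cluster.
    t_crit: Student t quantile for the chosen confidence level with
            len(control) + len(treated) - 2 degrees of freedom
            (assumed convention; looked up by the caller).
    """
    # Step 1: one outcome proportion per cluster.
    p0 = [events / size for events, size in control]
    p1 = [events / size for events, size in treated]
    k0, k1 = len(p0), len(p1)

    # Step 2: risk difference = difference of the arm-level means
    # of the cluster proportions (clusters weighted equally).
    rd = mean(p1) - mean(p0)

    # Step 3: pooled between-cluster variance and standard error.
    df = k0 + k1 - 2
    pooled_var = ((k0 - 1) * variance(p0) + (k1 - 1) * variance(p1)) / df
    se = sqrt(pooled_var * (1 / k0 + 1 / k1))

    # Step 4: Student-based confidence interval.
    ci = (rd - t_crit * se, rd + t_crit * se)
    return rd, se, ci, df


# Toy data (made up for illustration): three clusters per arm.
control = [(10, 50), (12, 40), (8, 40)]   # proportions 0.2, 0.3, 0.2
treated = [(5, 50), (4, 40), (6, 60)]     # proportions 0.1, 0.1, 0.1
rd, se, ci, df = cluster_level_risk_difference(control, treated, t_crit=2.776)
print(rd, se, ci, df)
```

With these toy numbers the risk difference is about -0.133 with a standard error of about 0.033 on 4 degrees of freedom; 2.776 is the 97.5% Student quantile for 4 degrees of freedom. As the abstract notes, this analysis does not adjust for individual-level covariates, which is why the adjusted methods (TSP, GC, TMLE) are compared against it.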

Statistics in Medicine, vol. 44, issue 28-30, e70341. Journal Article.
Citations: 0
Two-Stage Drop-the-Losers Design for the Selection of Effective Treatments and Estimating Their Average Worth.
IF 1.8; Zone 4, Medicine; Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-12-01; DOI: 10.1002/sim.70344
Yogesh Katariya, Neeraj Misra

In multi-arm clinical trials, several new treatments are often evaluated concurrently to identify the best and confirm their superiority over a control. In this paper, we propose a framework that introduces an intermediate stage aimed at assessing the collective efficacy of treatments retained after initial screening. Estimating the average effect of the selected treatments provides an interpretable measure of their collective potential and serves as a data-driven criterion for deciding whether to continue or terminate the trial. Consider $k\ (\geq 2)$ experimental treatments whose effects are described by independent Gaussian responses with unknown means and a common variance. For the purpose of selecting the effective treatments (drugs) and estimating their average worth, we employ a two-stage drop-the-losers design (DLD). To get an idea about the structure of an optimal estimator, we first assume that the common variance is known. In the first stage of the design, data is collected to select a subset of experimental treatments so that the probability of including the best treatment is at least a prespecified level $P^{*}$. This selection rule ensures that inferior treatments are eliminated while maintaining a minimum confidence that the best treatment remains among those advanced. Given this requirement, the design either advances all selected treatments to the next stage or stops for futility. The treatment(s) selected in the subset then proceed to the second stage for estimating their collective effectiveness through point estimation of their average worth, defined as the arithmetic average of their mean effects. Since the bias of estimators is crucial in clinical studies, we derive the uniformly minimum variance conditionally unbiased estimator (UMVCUE) of the worth of the selected treatments, conditioned on the indices of treatments selected at the first stage. The mean squared error and bias performances of the UMVCUE are compared with the naive estimator (maximum likelihood estimator) via a simulation study. For the unknown variance scenario, we propose a plug-in estimator based on the structure of the UMVCUE derived for the known variance case and study its performance through simulations. A real-life data example is also provided to illustrate an application of our findings.
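
The quantities named in this abstract can be written out as follows. The notation is assumed here for illustration only, since the abstract does not fix symbols: $\mu_1, \dots, \mu_k$ are the unknown treatment means, $S$ is the random subset selected at the first stage, and $\bar{X}_i$ is the sample mean of treatment $i$ over the data collected for it.

```latex
% Assumed notation: \mu_i = mean effect of treatment i, S = subset selected at stage 1.
% Selection requirement: the best treatment is retained with probability at least P^*.
\[
  \Pr\bigl(\text{best treatment} \in S\bigr) \;\geq\; P^{*}
\]
% Average worth of the selected subset: arithmetic average of the selected mean effects.
\[
  W(S) \;=\; \frac{1}{|S|} \sum_{i \in S} \mu_i
\]
% Naive plug-in estimator (sample means in place of the unknown means).
\[
  \widehat{W}_{\mathrm{naive}} \;=\; \frac{1}{|S|} \sum_{i \in S} \bar{X}_i
\]
% Conditional unbiasedness, the property the UMVCUE is required to satisfy:
\[
  \mathbb{E}\bigl[\,\widehat{W} \mid S = s\,\bigr] \;=\; W(s) \quad \text{for every possible subset } s .
\]
```

Because $S$ is chosen using the same first-stage data that enter $\bar{X}_i$, the naive estimator is generally biased for $W(S)$, which is the motivation the abstract gives for conditioning on the selected indices.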

在多组临床试验中,经常同时评估几种新疗法,以确定最佳疗法,并确认其优于对照组。在本文中,我们提出了一个框架,引入了一个中间阶段,旨在评估初步筛选后保留的治疗方法的集体功效。对所选治疗方法的平均效果进行估计,可以对其综合潜力进行可解释的衡量,并作为决定是否继续或终止试验的数据驱动标准。考虑k(≥2)$$ kkern0.3em left(ge 2right) $$实验处理,其效果由具有未知均值和共同方差的独立高斯响应描述。为了选择有效的治疗方法(药物)并估计其平均价值,我们采用了两阶段抛弃失败者设计(DLD)。为了了解最优估计量的结构,我们首先假设公共方差是已知的。在设计的第一阶段,收集数据以选择实验处理的一个子集,以便包括最佳处理的概率至少为预先指定的水平P * $$ {P}^{ast } $$。这一选择规则确保了较差的治疗方法被淘汰,同时保持了最好的治疗方法仍然在那些先进的治疗方法中的最低信心。考虑到这一要求,设计要么将所有选定的处理推进到下一阶段,要么因无效而停止。然后,在子集中选择的治疗进入第二阶段,通过对其平均值的点估计来估计其集体有效性,定义为其平均效果的算术平均值。由于估计器的偏差在临床研究中是至关重要的,我们推导了所选治疗价值的一致最小方差条件无偏估计器(UMVCUE),条件是在第一阶段选择的治疗指标。通过仿真研究,比较了UMVCUE与朴素估计(极大似然估计)的均方误差和偏置性能。对于未知方差情况,我们提出了一种基于已知方差情况下衍生的UMVCUE结构的插件估计器,并通过仿真研究了其性能。还提供了一个实际数据示例来说明我们的研究结果的应用。
Citations: 0
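The abstract above contrasts the naive estimator of the average worth with a conditionally unbiased one. The following is a minimal simulation sketch of why the naive estimator is biased after selection. It is not the authors' method: the Gupta-type selection rule, the constant `d`, the sample sizes `n1` and `n2`, and the function name `simulate_dld` are all assumptions made for illustration, the futility stop is omitted, and the UMVCUE itself is not implemented.

```python
import numpy as np

def simulate_dld(mu, sigma=1.0, n1=20, n2=20, d=1.0, n_sim=20000, seed=0):
    """Monte Carlo bias of the naive estimator of the average worth.

    All parameter names and defaults are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    k = mu.size
    # Stage-1 and stage-2 sample means, one row per simulated trial.
    xbar1 = rng.normal(mu, sigma / np.sqrt(n1), size=(n_sim, k))
    xbar2 = rng.normal(mu, sigma / np.sqrt(n2), size=(n_sim, k))
    # Assumed Gupta-type rule: keep arm i if its stage-1 mean is within
    # d * sigma / sqrt(n1) of the largest stage-1 mean.
    threshold = xbar1.max(axis=1, keepdims=True) - d * sigma / np.sqrt(n1)
    selected = xbar1 >= threshold
    # Naive estimator: pool both stages per selected arm, then average.
    pooled = (n1 * xbar1 + n2 * xbar2) / (n1 + n2)
    subset_size = selected.sum(axis=1)
    naive = (pooled * selected).sum(axis=1) / subset_size
    # Target: arithmetic average of the true means of the selected arms.
    worth = (mu * selected).sum(axis=1) / subset_size
    return {
        "bias": float(np.mean(naive - worth)),
        "mean_subset_size": float(subset_size.mean()),
    }

result = simulate_dld([0.0, 0.0, 0.0, 0.0])
print(result)
```

With equal true means, the arms that survive stage 1 are those whose stage-1 averages happened to be high, so the pooled average overstates their worth. In a real design, `d` would be calibrated so that the best arm is retained with probability at least $P^{*}$.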
Explaining Individualized Treatment Rules: Integrating LIME and SHAP With XGBoost in Precision Medicine.
IF 1.8 · Zone 4 (Medicine) · Q3 Mathematical & Computational Biology · Pub Date: 2025-12-01 · DOI: 10.1002/sim.70322
Zihuan Liu, Xin Huang

Precision medicine relies on accurate and interpretable predictive models to identify patient subgroups and biomarkers that can guide individualized treatment strategies. While extreme gradient boosting (XGBoost) often achieves state-of-the-art predictive performance, its complexity can impede understanding of how input variables influence outcomes. Building upon existing XGBoost frameworks for estimating individualized treatment rule (ITR), we introduce a global permutation test within this framework to assess treatment effect heterogeneity. Additionally, we incorporate two model-agnostic explanation techniques, local interpretable model-agnostic explanations (LIME) and SHapley Additive exPlanations (SHAP), to enhance interpretability at both global and individual levels. Through simulations and analyses of real-world clinical trial datasets, we illustrate that our permutation-based pipeline can detect empirical signals of treatment effect heterogeneity, while LIME and SHAP offer exploratory insights into feature contributions and ITR.

Citations: 0
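The abstract above mentions a global permutation test for treatment effect heterogeneity. The abstract does not give the procedure, so the sketch below swaps in a deliberately simpler technique: a Freedman-Lane style residual permutation test of a treatment-by-covariate interaction in an ordinary least squares model, with no XGBoost involved. The function names, the single covariate `x`, and the choice of test statistic are assumptions for illustration only.

```python
import numpy as np

def interaction_stat(y, t, x):
    """Absolute OLS coefficient of the treatment-by-covariate interaction."""
    design = np.column_stack([np.ones_like(y), t, x, t * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return abs(coef[3])

def heterogeneity_permutation_test(y, t, x, n_perm=199, seed=0):
    """Permutation p-value for H0: the treatment effect does not vary with x."""
    rng = np.random.default_rng(seed)
    # Reduced (null) model: constant treatment effect, no interaction.
    reduced = np.column_stack([np.ones_like(y), t, x])
    coef, *_ = np.linalg.lstsq(reduced, y, rcond=None)
    fitted = reduced @ coef
    resid = y - fitted
    observed = interaction_stat(y, t, x)
    # Permute the null-model residuals, rebuild outcomes, refit the full model.
    exceed = 0
    for _ in range(n_perm):
        y_star = fitted + rng.permutation(resid)
        exceed += int(interaction_stat(y_star, t, x) >= observed)
    return (1 + exceed) / (1 + n_perm)

# Synthetic randomized trial with a genuine interaction (illustrative data).
rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=n)
t = rng.integers(0, 2, size=n).astype(float)
y = 0.5 * t + x + 1.0 * t * x + rng.normal(size=n)
print(heterogeneity_permutation_test(y, t, x))
```

Permuting residuals from the no-interaction model, rather than the treatment labels, keeps the main treatment effect in place, so the test targets heterogeneity and not the overall effect. A flexible learner could replace the linear fit at the cost of refitting it for every permutation.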
Inference on Controlled Effects for Assessing Immune Correlates of Protection Based on a Cox Model.
IF 1.8 · Zone 4 (Medicine) · Q3 Mathematical & Computational Biology · Pub Date: 2025-12-01 · DOI: 10.1002/sim.70347
Avi Kenny, Lars van der Laan, Peter Gilbert, Marco Carone

In vaccine research, it is important to identify biomarkers that can reliably predict vaccine efficacy against a clinical endpoint. Such biomarkers are known as immune correlates of protection (CoP) and can serve as surrogate endpoints in vaccine efficacy trials to accelerate the approval process. CoPs must be rigorously validated, and one method of doing so is through the controlled risk (CR) curve, a function that represents the causal effect of the biomarker on population-level risk of experiencing the endpoint of interest by a certain time post-vaccination. The CR curve can be estimated by leveraging a Cox proportional hazards model, but researchers currently rely on the bootstrap for inference, which can be computationally demanding. In this article, we analytically derive the asymptotic variance of this estimator, providing an analytic approach for constructing both pointwise and uniform confidence bands. We evaluate the finite sample performance of these methods in a simulation study and illustrate their use on data from the Coronavirus Efficacy (COVE) placebo-controlled phase 3 trial (NCT04470427) of the mRNA-1273 COVID-19 vaccine.

Citations: 0
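The abstract above describes estimating a controlled risk (CR) curve from a Cox proportional hazards model. One common way to turn a fitted Cox model into a population-level risk curve is covariate standardization: fix the biomarker at level s, predict each participant's risk by time t0, and average. The sketch below assumes that form. The coefficients, the baseline cumulative hazard value, and the function name are hypothetical inputs rather than fitted quantities, and the sketch ignores sampling weights and the variance derivation that the article is about.

```python
import numpy as np

def controlled_risk_curve(s_grid, W, beta_s, beta_w, cum_base_hazard):
    """Covariate-standardized risk by time t0 at each biomarker level.

    s_grid          : biomarker levels at which to evaluate the curve
    W               : (n, p) matrix of baseline covariates
    beta_s, beta_w  : assumed Cox log hazard ratios for biomarker and covariates
    cum_base_hazard : assumed baseline cumulative hazard at time t0
    """
    s_grid = np.asarray(s_grid, dtype=float)
    W = np.asarray(W, dtype=float)
    lin_w = W @ np.asarray(beta_w, dtype=float)        # shape (n,)
    # Linear predictor for every (biomarker level, participant) pair.
    lin = beta_s * s_grid[:, None] + lin_w[None, :]    # shape (len(s_grid), n)
    survival = np.exp(-cum_base_hazard * np.exp(lin))
    # Average over participants to standardize to the covariate distribution.
    return 1.0 - survival.mean(axis=1)

rng = np.random.default_rng(0)
W = rng.normal(size=(500, 2))
curve = controlled_risk_curve(np.linspace(0.0, 3.0, 7), W, -0.8, [0.3, -0.2], 0.05)
print(curve)
```

With a negative biomarker coefficient, the curve decreases as the biomarker level rises, which is the pattern expected of a protective correlate.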