
Statistics in Medicine: Latest Publications

Benchmarking Sparse Variable Selection Methods for Genomic Data Analyses.
IF 1.8 | CAS Q4 (Medicine) | JCR Q3, Mathematical & Computational Biology | Pub Date: 2026-02-01 | DOI: 10.1002/sim.70428
Hema Sri Sai Kollipara, Tapabrata Maiti, Sanjukta Chakraborty, Samiran Sinha

Genomics and other studies encounter many features, and selecting the essential ones with high accuracy is desired. In recent years, there has been significant advancement in the use of Bayesian inference for variable (or feature) selection; however, practical information on implementing these methods and assessing their relative performance remains limited. Our goal in this paper is to perform a comparative analysis of approaches, mainly from different Bayesian genres, that apply to genomic analysis. In particular, we examine how well shrinkage, global-local, and mixture priors, SuSIE, and a simple two-step procedure that we propose, RFSFS, perform in terms of several metrics: FDR, FNR, F-score, and mean squared prediction error under various simulation scenarios. No single method uniformly outperforms the others across all scenarios and across the variable selection and prediction performance metrics, so we order the methods by their average ranking across scenarios. We found that LASSO, the spike-and-slab prior with a normal slab (SN), and RFSFS are the most competitive methods for FDR and F-score when features are uncorrelated. When features are correlated, SN, SuSIE, and RFSFS are the most competitive methods for FDR, whereas LASSO has an edge over SuSIE in terms of F-score. For illustration, we have applied these methods to analyze The Cancer Genome Atlas Program (TCGA) renal cell carcinoma (RCC) data and have offered methodological direction.
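As a concrete illustration of the selection metrics compared in the paper, the sketch below computes FDR, FNR, and F-score for a LASSO-selected feature set against the truly active features of a simulated sparse regression. The data-generating settings and helper function are invented for this example and are not the paper's simulation design.

```python
# Minimal sketch: selection metrics for one simulated scenario (illustrative only).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

def selection_metrics(selected, truth):
    """FDR, FNR, and F-score for a set of selected features vs. the true actives."""
    selected, truth = set(selected), set(truth)
    tp = len(selected & truth)
    fp = len(selected - truth)
    fn = len(truth - selected)
    fdr = fp / max(len(selected), 1)
    fnr = fn / max(len(truth), 1)
    precision = tp / max(len(selected), 1)
    recall = tp / max(len(truth), 1)
    f_score = 0.0 if tp == 0 else 2 * precision * recall / (precision + recall)
    return fdr, fnr, f_score

# Sparse linear model: 10 of 500 features are truly active.
X, y, coef = make_regression(n_samples=200, n_features=500, n_informative=10,
                             noise=10.0, coef=True, random_state=0)
lasso = LassoCV(cv=5).fit(X, y)
print(selection_metrics(np.flatnonzero(lasso.coef_), np.flatnonzero(coef)))
```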

Citations: 0
Rforce: Random Forests for Composite Endpoints.
IF 1.8 | CAS Q4 (Medicine) | JCR Q3, Mathematical & Computational Biology | Pub Date: 2026-02-01 | DOI: 10.1002/sim.70413
Yu Wang, Soyoung Kim, Chien-Wei Lin, Kwang Woo Ahn

Medical research often involves the study of composite endpoints that combine multiple clinical events to assess the efficacy of treatments. When constructing composite endpoints, it is a common practice to analyze the time to the first event. However, this approach overlooks outcomes that occur after the first event, resulting in information loss. Furthermore, the terminal event can be not only of interest in its own right but also a competing risk for other types of outcomes. While existing semi-parametric regression models can be used to analyze both fatal (terminal) and non-fatal composite events, potential nonlinear covariate effects on the logarithm of the rate function have not been addressed. To address this important issue, we introduce random forests for composite endpoints (Rforce) consisting of non-fatal composite events and terminal events. Rforce utilizes generalized estimating equations to build trees and handles the dependent censoring due to the terminal events with the concept of pseudo-at-risk duration. Simulation studies and real data analysis are conducted to demonstrate the performance of Rforce.
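The sketch below only illustrates the bookkeeping behind such composite-endpoint data: counting non-fatal events and truncating the at-risk duration at the terminal or censoring time, the idea underlying the pseudo-at-risk concept. It is not the Rforce tree-building algorithm itself; the class and function names are invented.

```python
# Illustrative data structure for recurrent non-fatal events with a terminal event.
from dataclasses import dataclass
from typing import List

@dataclass
class Subject:
    event_times: List[float]   # times of non-fatal component events
    terminal_time: float       # terminal event or censoring time
    terminal_observed: bool    # True if the terminal event occurred

def event_and_at_risk(subject: Subject, tau: float):
    """Count events on [0, tau] and return the at-risk duration truncated
    at the terminal/censoring time (a simplified pseudo-at-risk duration)."""
    horizon = min(subject.terminal_time, tau)
    n_events = sum(t <= horizon for t in subject.event_times)
    return n_events, horizon

s = Subject(event_times=[0.8, 2.1], terminal_time=3.0, terminal_observed=True)
print(event_and_at_risk(s, tau=5.0))   # -> (2, 3.0)
```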

Citations: 0
Causal Covariate Selection for the Regression Calibration Method for Exposure Measurement Error Bias Correction.
IF 1.8 | CAS Q4 (Medicine) | JCR Q3, Mathematical & Computational Biology | Pub Date: 2026-02-01 | DOI: 10.1002/sim.70430
Wenze Tang, Donna Spiegelman, Yujie Wu, Molin Wang

In this paper, we investigate the selection of minimal and efficient covariate adjustment sets for the imputation-based regression calibration method, which corrects for bias due to continuous exposure measurement error. We use directed acyclic graphs to illustrate how subject-matter knowledge aids in selecting these sets. For unbiased measurement error correction, researchers must collect, in both main and validation studies, (I) common causes of both the true exposure and the outcome, and (II) common causes of both measurement error and the outcome. For regression calibration under linear models, at minimum, covariate set (I) must be adjusted for in both the measurement error model (MEM) and the outcome model, while set (II) should be adjusted for in at least the MEM. Adjusting for non-risk factors that are correlates of true exposure or measurement error within the MEM alone improves efficiency. We apply this covariate selection approach to the Health Professionals Follow-up Study, assessing fiber intake's effect on cardiovascular disease. We also highlight potential pitfalls in data-driven MEM building that ignores structural assumptions. Additionally, we extend existing estimators to allow for effect modification. Finally, we caution against using regression calibration to estimate the effect of true nutritional intake through calibrating biomarkers.
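For readers unfamiliar with the mechanics, the sketch below shows the basic two-step, imputation-based regression calibration under a linear model: fit the measurement error model E[X | W, Z] in a validation study, impute the true exposure in the main study, and fit the outcome model on the imputed values. The simulated data and coefficients are invented and do not reflect the paper's application.

```python
# Two-step regression calibration under linearity (illustrative data only).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_val, n_main = 300, 2000

# Validation study: true exposure X, mismeasured surrogate W, covariate Z.
Z_v = rng.normal(size=n_val)
X_v = 0.5 * Z_v + rng.normal(size=n_val)
W_v = X_v + rng.normal(scale=0.7, size=n_val)          # classical measurement error

# Step 1: measurement error model (MEM) E[X | W, Z] fit in the validation study.
mem = sm.OLS(X_v, sm.add_constant(np.column_stack([W_v, Z_v]))).fit()

# Main study: only W, Z, and the outcome Y are available to the analyst.
Z_m = rng.normal(size=n_main)
X_m = 0.5 * Z_m + rng.normal(size=n_main)               # latent truth (not used below)
W_m = X_m + rng.normal(scale=0.7, size=n_main)
Y_m = 1.0 + 2.0 * X_m + 0.5 * Z_m + rng.normal(size=n_main)

# Step 2: impute X from the MEM and fit the outcome model on the imputed exposure.
X_hat = mem.predict(sm.add_constant(np.column_stack([W_m, Z_m])))
outcome = sm.OLS(Y_m, sm.add_constant(np.column_stack([X_hat, Z_m]))).fit()
print(outcome.params)   # the X_hat coefficient should be close to the true effect 2.0
```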

Citations: 0
Bayesian Network Meta-Analysis With One or Two Continuous Outcomes Measured at Multiple Time Points Using Gaussian Random Walks With Drift.
IF 1.8 | CAS Q4 (Medicine) | JCR Q3, Mathematical & Computational Biology | Pub Date: 2026-02-01 | DOI: 10.1002/sim.70373
Pai-Shan Cheng, Bruno R da Costa, George Tomlinson

Network meta-analysis of randomized controlled trials is traditionally conducted on a single outcome measured at one time point. However, many trials also feature a secondary outcome, and both outcomes may have been reported at multiple time points. Existing network meta-analysis methods for synthesizing continuous outcome data from such trials focus on either the longitudinal data aspect or the multiple outcomes aspect, but not on both simultaneously. In this paper, we present two Bayesian network meta-analysis models that account for the correlation of outcome measurements over time using Gaussian random walks with drift. The first model is suitable for a single continuous outcome measured at multiple time points, while the second model extends the first to allow incorporation of a second outcome through cointegration of random walks. A simulation study is conducted to evaluate several statistical properties of these models. The results indicate that both proposed models produce unbiased estimates of relative treatment effect and drift parameters, as well as reasonable coverage. Furthermore, in some scenarios, using the cointegration model yields small gains in precision over using the single outcome model. Based on various performance measures, both proposed models also outperform an existing random walk network meta-analysis model previously used by investigators to synthesize osteoarthritis trials data. The proposed models are illustrated with an application to trials evaluating treatments for knee and hip osteoarthritis. Both models are useful additions to existing tools available to investigators undertaking a network meta-analysis of continuous outcome data at multiple time points.
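As a small illustration of the building block used here, the sketch below simulates a Gaussian random walk with drift for a treatment effect observed at successive time points; the parameter values are arbitrary and the cointegration extension is not shown.

```python
# Gaussian random walk with drift: effect_t = effect_{t-1} + drift + noise.
import numpy as np

def gaussian_random_walk(n_steps, drift, sigma, start=0.0, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    increments = drift + sigma * rng.standard_normal(n_steps)
    return start + np.cumsum(increments)

effects = gaussian_random_walk(n_steps=6, drift=-0.3, sigma=0.1,
                               rng=np.random.default_rng(42))
print(np.round(effects, 3))   # simulated relative effect at visits 1..6
```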

Citations: 0
Assessing Treatment Effects in Observational Data With Missing Confounders: A Comparative Study of Practical Doubly-Robust and Traditional Missing Data Methods.
IF 1.8 | CAS Q4 (Medicine) | JCR Q3, Mathematical & Computational Biology | Pub Date: 2026-02-01 | DOI: 10.1002/sim.70366
Brian D Williamson, Chloe Krakauer, Eric Johnson, Susan Gruber, Bryan E Shepherd, Mark J van der Laan, Thomas Lumley, Hana Lee, José J Hernández-Muñoz, Fengyu Zhao, Sarah K Dutcher, Rishi Desai, Gregory E Simon, Susan M Shortreed, Jennifer C Nelson, Pamela A Shaw

In pharmacoepidemiology, safety and effectiveness are frequently evaluated using readily available administrative and electronic health records data. In these settings, detailed confounder data are often not available in all data sources and therefore missing on a subset of individuals. Multiple imputation (MI) and inverse-probability weighting (IPW) are go-to analytical methods to handle missing data and are dominant in the biomedical literature. Doubly-robust methods, which are consistent under fewer assumptions, can be more efficient with respect to mean-squared error. We discuss two practical-to-implement doubly-robust estimators, generalized raking and inverse probability-weighted targeted maximum likelihood estimation (TMLE), which are both currently under-utilized in biomedical studies. We compare their performance to IPW and MI in a detailed numerical study for a variety of synthetic data-generating and missingness scenarios, including scenarios with rare outcomes and a high missingness proportion. Further, we consider plasmode simulation studies that emulate the complex data structure of a large electronic health records cohort in order to compare anti-depressant therapies in a rare-outcome setting where a key confounder is prone to more than 50% missingness. We provide guidance on selecting a missing data analysis approach, based on which methods excelled with respect to the bias-variance trade-off across the different scenarios studied.
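As a schematic reminder of one comparator method, the toy sketch below implements inverse-probability weighting for a confounder that is missing on a subset: complete cases are reweighted by the inverse of their estimated probability of being complete. The data, models, and missingness mechanism are invented and far simpler than the paper's plasmode settings.

```python
# Toy IPW for a partially missing confounder (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)                                   # confounder
a = rng.binomial(1, 1 / (1 + np.exp(-z)))                # treatment depends on z
y = rng.binomial(1, 1 / (1 + np.exp(-(-1.0 + 0.5 * a + z))))
complete = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 0.8 * a))))  # z observed only if 1

# Step 1: model P(complete | A, Y) and form inverse-probability-of-completeness weights.
design = np.column_stack([a, y])
w = 1 / LogisticRegression().fit(design, complete).predict_proba(design)[:, 1]

# Step 2: weighted outcome regression restricted to complete cases (where z is known).
cc = complete == 1
fit = LogisticRegression().fit(np.column_stack([a[cc], z[cc]]), y[cc],
                               sample_weight=w[cc])
print(fit.coef_[0, 0])   # weighted complete-case estimate of the treatment coefficient
```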

Citations: 0
Probability of Success for Establishing Noninferiority Across Multiple Visits: Extension of Covariate-Adjusted Bayesian Hierarchical Modeling Framework.
IF 1.8 | CAS Q4 (Medicine) | JCR Q3, Mathematical & Computational Biology | Pub Date: 2026-02-01 | DOI: 10.1002/sim.70423
Yujie Zhao, Yiran Hu, Xiaotian Chen, Jenny Jiao, Qiang Guo, Li Wang

Effective decision-making plays a vital role throughout the drug development process, particularly when a proof-of-concept (POC) or phase II study has been completed. To determine whether to proceed to a larger-scale, confirmatory phase III study, assessing the uncertainty about the underlying treatment effect and the probability of success (POS) in the phase III study is of critical importance. In this paper, we propose and investigate a Bayesian covariate-adjusted hierarchical modeling approach that leverages historical data with longitudinal outcomes to quantitatively assess the POS of the confirmatory phase III trial. Although historical data borrowing methods are widely used and known for their advantages in alleviating recruitment and ethical challenges as well as improving trial operational efficiency, their application to predicting the POS of a future trial with longitudinal outcomes over multiple visits poses methodological challenges. This paper not only provides a comprehensive modeling approach but also demonstrates how the proposed model can be used in a Go/No-Go decision-making framework with a glaucoma eye care project example. For the approval of new drugs targeting glaucoma, regulatory agencies typically require a pivotal phase III trial to demonstrate noninferiority compared to a standard-of-care treatment. This may involve meeting both statistical and clinical margins across multiple visits simultaneously. Simulations were performed to evaluate the key factors that affect the operating characteristics, such as between-trial heterogeneity, subject-level variance, and between-visit correlation. The proposed decision-making framework can also be applied to studies in other therapeutic areas with similar settings.
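The sketch below shows a deliberately simplified Monte Carlo version of a probability-of-success calculation for a single noninferiority comparison: a treatment difference is drawn from an assumed normal posterior summarizing phase II, and a phase III estimate is simulated around it. The multi-visit hierarchical structure and covariate adjustment of the proposed model are not reproduced, and all numbers are illustrative.

```python
# Simplified probability-of-success (POS) for one noninferiority endpoint.
import numpy as np

rng = np.random.default_rng(7)
post_mean, post_sd = -0.2, 0.4          # assumed posterior for the true difference
margin = -1.5                           # noninferiority margin
n_per_arm, sigma = 150, 3.0             # planned phase III size and outcome SD
se3 = sigma * np.sqrt(2 / n_per_arm)    # SE of the phase III difference estimate
z_crit = 1.96

n_sim = 100_000
deltas = rng.normal(post_mean, post_sd, size=n_sim)   # plausible true effects
delta_hats = rng.normal(deltas, se3)                  # simulated phase III estimates
pos = np.mean(delta_hats - z_crit * se3 > margin)     # lower bound clears the margin
print(f"Estimated POS: {pos:.3f}")
```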

Citations: 0
Classification-Specific Predictive Performance: A Unified Estimation and Inference Framework for Multi-Category Tests.
IF 1.8 | CAS Q4 (Medicine) | JCR Q3, Mathematical & Computational Biology | Pub Date: 2026-02-01 | DOI: 10.1002/sim.70431
A Gregory DiRienzo, Elie Massaad, Hutan Ashrafian

Multi-cancer testing with localization aims to detect signals from any of a set of targeted cancer types and predict the cancer signal origin from a single biological sample. Such tests have the potential to aid clinical decisions and significantly improve health outcomes. When used for multi-cancer screening in an asymptomatic population, these tests are referred to as multi-cancer early detection (MCED) tests. MCED testing has not yet achieved regulatory approval, reimbursement or broad clinical adoption. Some major reasons for this are that the clinical benefits and harms are not well understood, including the risk of unnecessary work-ups and false reassurance from a negative test that could reduce uptake of standard-of-care screening. Part of this uncertainty stems from the use of clinically obtuse metrics to assess the test's clinical validity. Traditionally, performance of MCED tests has been quantified using aggregate measures, disregarding the joint distribution of cancer type, stage (both at intended-use incidence rates) and predicted cancer signal origin, thereby obscuring biological variability and underlying differences in the test's behavior and limiting insight into true effectiveness. Clinically informative evaluation of an MCED test's performance requires metrics that are specific to cancer type, stage and predicted cancer origin at expected incidence rates in the intended-use population. In the context of a case-control sampling design, this paper derives analytical methods that allow for unbiased estimation of cancer-specific intrinsic accuracy, predicted cancer signal origin-specific predictive value and the marginal test classification distribution, each with corresponding valid confidence interval formulae. A simulation study is presented that evaluates performance of the proposed methodology and provides guidance for implementation. An application to a published MCED test dataset is given. The derived statistical analysis framework in general allows for estimation and inference for pointed metrics of a multi-category test that enables precisely informed decision-making, supports optimized trial designs across classical, digital, AI-driven, and hybrid stratified diagnostic screening platforms, and facilitates informed healthcare decisions by clinicians, policymakers, regulators, scientists, and patients.
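The sketch below illustrates only the re-weighting idea behind classification-specific predictive values: per-class classification probabilities estimated from a case-control sample are combined with assumed intended-use incidence rates via Bayes' rule. The two-cancer setup and all numbers are invented, and the paper's estimators and confidence interval formulae are not reproduced.

```python
# Predicted-class-specific predictive values at intended-use incidence rates.
import numpy as np

classes = ["no cancer", "cancer A", "cancer B"]
incidence = np.array([0.990, 0.006, 0.004])     # assumed intended-use prevalence

# Rows: true class; columns: predicted class. Estimated P(prediction | truth)
# from a case-control sample (each row sums to 1).
p_pred_given_truth = np.array([
    [0.97, 0.02, 0.01],
    [0.30, 0.60, 0.10],
    [0.35, 0.15, 0.50],
])

# Joint distribution at intended-use rates, then predictive values by Bayes' rule.
joint = incidence[:, None] * p_pred_given_truth          # P(truth, prediction)
marginal_pred = joint.sum(axis=0)                        # P(prediction)
ppv = joint / marginal_pred                              # P(truth | prediction), by column
for j, c in enumerate(classes):
    print(f"Predicted {c}: P(truth | prediction) = {np.round(ppv[:, j], 3)}")
```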

Citations: 0
Bayesian Response-Adaptive Randomization for Cluster Randomized Controlled Trials.
IF 1.8 | CAS Q4 (Medicine) | JCR Q3, Mathematical & Computational Biology | Pub Date: 2026-01-01 | DOI: 10.1002/sim.70386
Yunyi Liu, Maile Young Karris, Sonia Jain

Cluster randomized controlled trials, in which groups (or clusters) of individuals rather than single individuals are randomized, are especially useful when individual-level randomization is not feasible or when interventions are naturally delivered at the group level. Balanced randomization in the cluster randomized trial setting can pose logistical challenges and strain resources if subjects are randomized to a non-optimal arm. We propose a Bayesian response-adaptive randomization design for cluster randomized controlled trials based on Thompson sampling, which dynamically allocates clusters to the most efficacious treatment arm based on the interim posterior distributions of treatment effects using Markov chain Monte Carlo sampling. Our design also incorporates early stopping rules for efficacy and futility determined by prespecified posterior probability thresholds. The performance of the proposed design is evaluated across various operating characteristics under multiple settings, including varying intra-cluster correlation coefficients, cluster sizes, and effect sizes. Our adaptive approach is also compared with a standard, parallel two-arm cluster randomized controlled clinical trial design, highlighting improvements in both ethical considerations and efficiency. From our simulation studies based on an HIV behavioral trial, we demonstrate these improvements by preferentially assigning more clusters to the more efficacious intervention while maintaining robust statistical power and controlling false positive rates.
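As a minimal illustration of the allocation engine, the sketch below applies Thompson sampling with Beta posteriors to decide which of two arms receives the next cluster. The paper's design operates on cluster-level data with MCMC, intra-cluster correlation, and stopping rules; the counts here are invented.

```python
# Thompson sampling for the next cluster allocation (binary outcomes, two arms).
import numpy as np

rng = np.random.default_rng(3)
successes = np.array([12, 18])   # observed responses per arm so far
failures = np.array([20, 14])

def next_allocation(successes, failures, rng):
    """Sample each arm's response rate from its Beta(1+s, 1+f) posterior and
    assign the next cluster to the arm with the larger draw."""
    draws = rng.beta(1 + successes, 1 + failures)
    return int(np.argmax(draws)), draws

arm, draws = next_allocation(successes, failures, rng)
print(f"posterior draws = {np.round(draws, 3)}, allocate next cluster to arm {arm}")
```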

Citations: 0
An Improved Bayesian Pick-the-Winner (IBPW) Design for Randomized Phase II Clinical Trials.
IF 1.8 | CAS Q4 (Medicine) | JCR Q3, Mathematical & Computational Biology | Pub Date: 2026-01-01 | DOI: 10.1002/sim.70348
Wanni Lei, Maosen Peng, Nasser Altorki, Xi Kathy Zhou

Phase II clinical trials play a pivotal role in drug development by screening a large number of drug candidates to identify those with promising preliminary efficacy for phase III testing. Trial designs that enable efficient decision-making with small sample sizes and early futility stopping while controlling for type I and type II errors in hypothesis testing, such as Simon's two-stage design, are preferred. Randomized multi-arm trials are increasingly used in phase II settings to overcome the limitations associated with using historical controls as the reference. However, how to effectively balance efficiency and accurate decision-making continues to be an important research topic. A notable development in phase II randomized design methodology is the Bayesian pick-the-winner (BPW) design proposed by Chen et al. [1]. Despite multiple appealing features, this method cannot easily control the overall type I and type II errors for winner selection. Here, we introduce an improved randomized two-stage Bayesian pick-the-winner (IBPW) design that formalizes winner-selection-based hypothesis testing and optimizes sample sizes and decision cut-offs by strictly controlling the type I and type II errors under a set of flexible hypotheses for winner selection across two treatment arms. Simulation studies demonstrate that our new design offers improved operating characteristics for winner selection while retaining the desirable features of the BPW design.
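A core ingredient of pick-the-winner designs is the posterior probability that one arm's response rate exceeds the other's. The sketch below estimates this quantity by simulation under independent Beta posteriors; the counts are invented, and the IBPW design's staged decision rules and error control are not shown.

```python
# Posterior probability that arm A's response rate exceeds arm B's.
import numpy as np

rng = np.random.default_rng(11)
resp_a, n_a = 14, 30     # responses / patients, arm A
resp_b, n_b = 9, 30      # responses / patients, arm B

draws_a = rng.beta(1 + resp_a, 1 + n_a - resp_a, size=200_000)
draws_b = rng.beta(1 + resp_b, 1 + n_b - resp_b, size=200_000)
prob_a_wins = np.mean(draws_a > draws_b)
print(f"P(p_A > p_B | data) = {prob_a_wins:.3f}")   # compare with a prespecified cut-off
```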

Citations: 0
Overview and Practical Recommendations on Using Shapley Values for Identifying Predictive Biomarkers via CATE Modeling.
IF 1.8 | CAS Q4 (Medicine) | JCR Q3, Mathematical & Computational Biology | Pub Date: 2026-01-01 | DOI: 10.1002/sim.70375
David Svensson, Erik Hermansson, Nikolaos Nikolaou, Konstantinos Sechidis, Ilya Lipkovich

In recent years, two parallel research trends have emerged in machine learning, yet their intersections remain largely unexplored. On one hand, there has been a significant increase in literature focused on Individual Treatment Effect (ITE) modeling, particularly targeting the Conditional Average Treatment Effect (CATE) using meta-learner techniques. These approaches often aim to identify causal effects from observational data. On the other hand, the field of Explainable Machine Learning (XML) has gained traction, with various approaches developed to explain complex models and make their predictions more interpretable. A prominent technique in this area is Shapley Additive Explanations (SHAP), which has become mainstream in data science for analyzing supervised learning models. However, there has been limited exploration of SHAP's application in identifying predictive biomarkers through CATE models, a crucial aspect in pharmaceutical precision medicine. We address inherent challenges associated with the SHAP concept in multi-stage CATE strategies and introduce a surrogate estimation approach that is agnostic to the choice of CATE strategy, effectively reducing computational burdens in high-dimensional data. Using this approach, we conduct simulation benchmarking to evaluate the ability to accurately identify biomarkers using SHAP values derived from various CATE meta-learners and Causal Forest.
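The sketch below illustrates the surrogate idea in a toy form: a simple T-learner produces CATE estimates, and a single tree-based surrogate model is then fit to those estimates so that one set of feature attributions (for example SHAP values on the surrogate) can be computed regardless of the meta-learner used. The data, model choices, and the reference to shap.TreeExplainer are illustrative assumptions, not the paper's benchmarked pipeline.

```python
# T-learner CATE estimates plus a surrogate model for feature attribution.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(5)
n, p = 2000, 10
X = rng.normal(size=(n, p))
treat = rng.binomial(1, 0.5, size=n)
tau = 1.0 * (X[:, 0] > 0)                       # feature 0 is the predictive biomarker
y = X[:, 1] + tau * treat + rng.normal(size=n)  # feature 1 is prognostic only

# T-learner: separate outcome models per arm, CATE = difference of predictions.
m1 = GradientBoostingRegressor().fit(X[treat == 1], y[treat == 1])
m0 = GradientBoostingRegressor().fit(X[treat == 0], y[treat == 0])
cate_hat = m1.predict(X) - m0.predict(X)

# Surrogate model fit to the CATE estimates; explanations are computed on it
# (e.g., shap.TreeExplainer(surrogate).shap_values(X) in the shap package).
surrogate = GradientBoostingRegressor().fit(X, cate_hat)
print(np.round(surrogate.feature_importances_, 3))   # signal should concentrate on feature 0
```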

Citations: 0