
Statistics in Medicine: Latest Articles

Robust Distribution-Free Tests for the Linear Model.
IF 1.8; Zone 4, Medicine; Q3, Mathematical & Computational Biology; Pub Date: 2026-02-01; DOI: 10.1002/sim.70404
Torey Hilbert, Steven N MacEachern, Yuan Zhang

Recently, there has been growing concern about heavy-tailed and skewed noise in biological data. We introduce RobustPALMRT, a flexible permutation framework for testing the association of a covariate of interest after adjusting for control covariates. RobustPALMRT controls the type I error rate in finite samples, even in the presence of heavy-tailed or skewed noise. The new framework expands the scope of state-of-the-art tests in three directions. First, our method applies to robust and quantile regressions, even with the necessary hyperparameter tuning. Second, by separating model fitting from model evaluation, we find that performance improves when a robust loss function is used in the model-evaluation step, regardless of how the model is fit. Third, we allow fitting multiple models to detect specialized features of interest in a distribution. To demonstrate this, we introduce DispersionPALMRT, which tests for differences in dispersion between treatment and control groups. We establish theoretical guarantees, identify settings where our method has greater power than existing methods, and analyze existing immunological data on Long-COVID patients. Using RobustPALMRT, we uncover novel differences between Long-COVID patients and others, even in the presence of highly skewed noise.
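
To make the idea of a robust loss in the model-evaluation step concrete, here is a minimal sketch comparing a squared loss with a Huber loss on a set of residuals. The abstract does not say which robust loss the authors use; the Huber loss, the threshold `delta = 1.345`, and the residual values are assumptions chosen purely for illustration, and this is not the RobustPALMRT procedure itself.

```python
def squared_loss(residual: float) -> float:
    """Half the squared residual; grows quadratically, so outliers dominate."""
    return 0.5 * residual * residual


def huber_loss(residual: float, delta: float = 1.345) -> float:
    """Huber loss: quadratic near zero, linear beyond `delta` (assumed threshold)."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def total_loss(residuals, loss) -> float:
    """Sum a pointwise loss over all residuals of a fitted model."""
    return sum(loss(r) for r in residuals)


# Hypothetical residuals: three well-behaved values and one heavy-tailed outlier.
residuals = [0.2, -0.5, 0.1, 12.0]

print("squared:", total_loss(residuals, squared_loss))
print("huber:  ", total_loss(residuals, huber_loss))
```

Under the squared loss the single outlier contributes almost the entire total, whereas the Huber loss caps its influence at a linear rate, which is the kind of behavior that makes a loss "robust" to heavy-tailed noise.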

Statistics in Medicine 45(3-5): e70404. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12875190/pdf/
Citations: 0
Is UWLS Really Better for Medical Research?
IF 1.8; Zone 4, Medicine; Q3, Mathematical & Computational Biology; Pub Date: 2026-02-01; DOI: 10.1002/sim.70411
Sanghyun Hong, W Robert Reed

This study evaluates the performance of the Unrestricted Weighted Least Squares (UWLS) estimator in meta-analyses of medical research. Using a large-scale simulation approach, it addresses the limitations of model selection criteria in small-sample contexts. Prior research using the Cochrane Database of Systematic Reviews (CDSR) reported that UWLS outperformed Random Effects (RE) and, in some cases, Fixed Effect (FE) estimators when assessed using AIC and BIC. However, we show that idiosyncratic characteristics of the CDSR datasets, notably their small sample sizes and weak-signal settings (where key parameters are often small in magnitude), undermine the reliability of AIC and BIC for model selection. Accordingly, we simulate 108 000 datasets mirroring the original CDSR data. This allows us to know the true model parameters and evaluate the estimators more accurately. While all estimators performed similarly with respect to bias and efficiency, RE consistently produced more accurate standard errors than UWLS, making confidence intervals and hypothesis testing more reliable. The comparison with FE was less clear. We therefore recommend continued use of the RE estimator as a reliable general-purpose approach for medical research, with the choice between UWLS and FE made in light of the likely extent of effect heterogeneity in the data.
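
For readers unfamiliar with the three estimators being compared, the sketch below computes each from a list of study effects and standard errors. It assumes the usual textbook definitions: inverse-variance weighting for FE, the DerSimonian-Laird moment estimator of the between-study variance for RE, and, for UWLS, the FE point estimate with its variance rescaled by a multiplicative dispersion factor Q/(k-1). The abstract does not state which RE variant the authors use, so the DerSimonian-Laird choice and all function names are assumptions.

```python
import math


def fixed_effect(y, se):
    """Inverse-variance weighted mean and its standard error."""
    w = [1.0 / s**2 for s in se]
    est = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return est, math.sqrt(1.0 / sum(w))


def cochran_q(y, se):
    """Cochran's Q: weighted squared deviations from the FE estimate."""
    est, _ = fixed_effect(y, se)
    return sum((yi - est) ** 2 / s**2 for yi, s in zip(y, se))


def uwls(y, se):
    """UWLS: FE point estimate, variance scaled by phi = Q / (k - 1); needs k >= 2."""
    est, se_fe = fixed_effect(y, se)
    phi = cochran_q(y, se) / (len(y) - 1)  # may be below 1 ("unrestricted")
    return est, se_fe * math.sqrt(phi)


def random_effects_dl(y, se):
    """Random effects with the DerSimonian-Laird estimate of tau^2 (assumed variant)."""
    k = len(y)
    w = [1.0 / s**2 for s in se]
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (cochran_q(y, se) - (k - 1)) / c)
    w_star = [1.0 / (s**2 + tau2) for s in se]
    est = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    return est, math.sqrt(1.0 / sum(w_star)), tau2


# Hypothetical meta-analysis of three small studies.
effects = [0.10, 0.35, 0.50]
std_errors = [0.10, 0.15, 0.20]
print("FE:  ", fixed_effect(effects, std_errors))
print("UWLS:", uwls(effects, std_errors))
print("RE:  ", random_effects_dl(effects, std_errors))
```

UWLS and FE always share a point estimate and differ only in the standard error, which is why the abstract's comparison turns on the accuracy of standard errors rather than on bias.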

Statistics in Medicine 45(3-5): e70411. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12874514/pdf/
Citations: 0
Patient-Centric Pragmatic Clinical Trials: Opening the DOOR.
IF 1.8; Zone 4, Medicine; Q3, Mathematical & Computational Biology; Pub Date: 2026-02-01; DOI: 10.1002/sim.70328
Scott R Evans, Qihang Wu, Toshimitsu Hamasaki

Randomized clinical trials are the gold standard for evaluating the benefits and harms of interventions, though they often fail to provide the evidence needed to inform medical decision-making. The primary reasons are that the questions most important for informing clinical practice go unrecognized, that traditional approaches do not address these questions directly, and that these questions are consequently not used to motivate the design, monitoring, analysis, and reporting of clinical trials. The standard approach of analyzing one outcome at a time fails to incorporate the associations among, or the cumulative nature of, multiple outcomes in individual patients; suffers from competing-risk complexities when individual outcomes are interpreted; fails to recognize important gradations of patient-centric responses; and, because efficacy and safety analyses are often conducted on different populations, leaves benefit:risk estimands and generalizability unclear. Cardiovascular event prevention trials typically utilize: (1) major adverse cardiovascular events (MACE; for example, stroke, myocardial infarction, and death) as the primary endpoint, which fails to recognize multiple events or the differential importance of events, and (2) relative risk models, which rely on robustness-challenging modeling assumptions and are contraindicated in benefit:risk and multiple outcome evaluation. The Desirability Of Outcome Ranking (DOOR) is a paradigm for the design, data monitoring, analysis, interpretation, and reporting of clinical trials based on comprehensive patient-centric benefit:risk evaluation, developed to address these issues and advance clinical trial science. The rationale and the methodology for the design and analyses for the DOOR paradigm are described. The methods are illustrated using an example. Freely available online tools for the design and analysis of studies implementing the DOOR are provided.
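
One summary commonly associated with ranked-outcome analyses of this kind is the probability that a randomly chosen patient in one arm has a more desirable overall outcome than a randomly chosen patient in the other arm, counting ties as one half. The abstract does not spell out this quantity, so the sketch below is an assumption about how a DOOR-style comparison might be summarized; the rank categories and data are hypothetical.

```python
def door_probability(treatment_ranks, control_ranks):
    """P(treatment more desirable) + 0.5 * P(tie), over all cross-arm patient pairs.

    Each patient has one ordinal rank, where rank 1 is the most desirable
    overall outcome (assumed convention).
    """
    wins = 0
    ties = 0
    for t in treatment_ranks:
        for c in control_ranks:
            if t < c:
                wins += 1
            elif t == c:
                ties += 1
    n_pairs = len(treatment_ranks) * len(control_ranks)
    return (wins + 0.5 * ties) / n_pairs


# Hypothetical ranks: 1 = alive with no events, 2 = alive with one event, 3 = death.
treatment = [1, 1, 2, 3]
control = [1, 2, 3, 3]
print(door_probability(treatment, control))
```

A value of 0.5 indicates no difference between arms; values above 0.5 favor the treatment arm. Because each patient contributes a single rank that already combines benefits and harms, the comparison is made at the patient level instead of one outcome at a time.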

Statistics in Medicine 45(3-5): e70328.
Citations: 0
A Powerful and Self-Adaptive Weighted Logrank Test.
IF 1.8; Zone 4, Medicine; Q3, Mathematical & Computational Biology; Pub Date: 2026-02-01; DOI: 10.1002/sim.70390
Zhiguo Li, Xiaofei Wang

In a weighted logrank test, such as the Harrington-Fleming test and the Tarone-Ware test, predetermined weights are used to emphasize early, middle, or late differences in survival distributions to maximize the test's power. The optimal weight function under an alternative, which depends on the true hazard functions of the groups being compared, has been derived. However, that optimal weight function cannot be directly used to construct an optimal test since the resulting test does not properly control the type I error rate. We further show that the power of a weighted logrank test with proper type I error control has an upper bound that cannot be achieved. Based on the theory, we propose a weighted logrank test that self-adaptively determines an "optimal" weight function. The new test is more powerful than existing standard and weighted logrank tests while maintaining proper type I error rates by tuning a parameter. We demonstrate through extensive simulation studies that the proposed test is both powerful and highly robust in a wide range of scenarios. The method is illustrated with data from several clinical trials in lung cancer.
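
As background for the predetermined-weight tests mentioned above, the following sketch computes a two-group weighted logrank Z statistic with weights of the form S(t-)^rho * (1 - S(t-))^gamma, where S is the pooled Kaplan-Meier estimate just before each event time. This is the classical fixed-weight construction, not the self-adaptive test proposed in the paper; the function name and the toy data are illustrative assumptions.

```python
import math


def weighted_logrank(times, events, groups, rho=0.0, gamma=0.0):
    """Z statistic of a weighted logrank test for group 1 versus group 0.

    times: follow-up times; events: 1 = event, 0 = censored; groups: 0 or 1.
    rho = gamma = 0 gives the standard logrank test; gamma > 0 emphasizes
    late differences and rho > 0 emphasizes early differences.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv = 1.0  # pooled Kaplan-Meier estimate just before the current event time
    u = 0.0     # weighted sum of observed minus expected events in group 1
    v = 0.0     # variance of u under the null hypothesis
    for t in event_times:
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e == 1 and g == 1)
        w = surv**rho * (1.0 - surv) ** gamma
        u += w * (d1 - d * n1 / n)
        if n > 1:
            v += w * w * d * (n1 / n) * (1.0 - n1 / n) * (n - d) / (n - 1)
        surv *= 1.0 - d / n
    return u / math.sqrt(v) if v > 0 else 0.0


# Hypothetical data: four patients, all with observed events.
z = weighted_logrank([1, 2, 3, 4], [1, 1, 1, 1], [1, 0, 1, 0])
print(z)
```

Because rho and gamma must be fixed in advance, a poor choice can cost power when the true hazard difference occurs at a different time than expected, which is the limitation a self-adaptive weight is meant to address.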

Statistics in Medicine 45(3-5): e70390.
Citations: 0
Bayesian Pliable Lasso With Horseshoe Prior for Interaction Effects in GLMs With Missing Responses.
IF 1.8; Zone 4, Medicine; Q3, Mathematical & Computational Biology; Pub Date: 2026-02-01; DOI: 10.1002/sim.70406
The Tien Mai

Sparse regression problems, where the goal is to identify a small set of relevant predictors, often require modeling not only main effects but also meaningful interactions through other variables. While the pliable lasso has emerged as a powerful frequentist tool for modeling such interactions under strong heredity constraints, it lacks a natural framework for uncertainty quantification and incorporation of prior knowledge. In this paper, we propose a Bayesian pliable lasso that extends this approach by placing sparsity-inducing priors, such as the horseshoe, on both main and interaction effects. The hierarchical prior structure enforces heredity constraints while adaptively shrinking irrelevant coefficients and allowing important effects to persist. We extend this framework to generalized linear models and develop a tailored approach to handle missing responses. To facilitate posterior inference, we develop an efficient Gibbs sampling algorithm based on a reparameterization of the horseshoe prior. Our Bayesian framework yields sparse, interpretable interaction structures and principled measures of uncertainty. Through simulations and real-data studies, we demonstrate its advantages over existing methods in recovering complex interaction patterns under both complete and incomplete data. Our method is implemented in the package hspliable, available on GitHub: https://github.com/tienmt/hspliable.
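
For orientation, the standard forms of the two ingredients named above can be written as follows. The notation is an assumption and is not taken from the paper: X_j is the j-th predictor, Z the matrix of modifying variables, and the circle denotes elementwise multiplication. How the paper couples the priors to enforce heredity is not stated in the abstract and is left unspecified here. The final line shows one commonly used auxiliary-variable representation of the half-Cauchy that makes Gibbs updates conjugate; whether the authors use this particular reparameterization is likewise an assumption.

```latex
% Pliable lasso linear predictor with modifying variables Z
\eta = \beta_0 \mathbf{1} + Z\theta_0
     + \sum_{j=1}^{p} X_j \circ \left( \beta_j \mathbf{1} + Z\theta_j \right),
\qquad \theta_j \neq 0 \;\Rightarrow\; \beta_j \neq 0 .

% Horseshoe prior on a generic coefficient \beta_j
\beta_j \mid \lambda_j, \tau \sim \mathcal{N}\!\left(0, \lambda_j^{2}\tau^{2}\right),
\qquad \lambda_j \sim \mathrm{C}^{+}(0,1),
\qquad \tau \sim \mathrm{C}^{+}(0,1).

% Scale-mixture representation of \lambda_j \sim C^{+}(0,1) via an auxiliary \nu_j
\lambda_j^{2} \mid \nu_j \sim \mathrm{IG}\!\left(\tfrac{1}{2}, \tfrac{1}{\nu_j}\right),
\qquad \nu_j \sim \mathrm{IG}\!\left(\tfrac{1}{2}, 1\right).
```

The local scales allow individual coefficients to escape shrinkage while the global scale pulls the bulk toward zero, which matches the abstract's description of shrinking irrelevant coefficients while letting important effects persist.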

Statistics in Medicine 45(3-5): e70406.
Citations: 0
An Improved Misclassification Simulation Extrapolation (MC-SIMEX) Algorithm.
IF 1.8; Zone 4, Medicine; Q3, Mathematical & Computational Biology; Pub Date: 2026-02-01; DOI: 10.1002/sim.70418
Varadan Sevilimedu, Lili Yu

Misclassification Simulation-Extrapolation (MC-SIMEX) is an established method for correcting misclassification in the binary covariates of a model. It has two components: a simulation component, which generates pseudo-datasets with an added degree of misclassification in the binary covariate, and an extrapolation component, which uses a quadratic function to model the covariate's regression coefficients obtained at each level of misclassification. This quadratic function is then used to extrapolate the covariate's regression coefficients to the point of "no error" in the classification of the binary covariate in question. However, extrapolation functions are usually not known exactly beforehand and are therefore only approximations. In this article, we propose an innovative method that uses the exact (not approximated) extrapolation function, obtained from a derived relationship between the naïve regression coefficient estimates and the true coefficients in generalized linear models. Simulation studies are conducted to compare the numerical properties of the resulting estimator with those of the original MC-SIMEX estimator. A real-data analysis using colon cancer data from the MSKCC cancer registry is also provided.
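
The simulation and quadratic-extrapolation steps of the original MC-SIMEX can be sketched as below. This is a simplified illustration under stated assumptions, not the improved algorithm proposed in the paper: the misclassification is taken to be symmetric with known flip probability `p`, the model is a simple linear regression on one binary covariate (so the naive slope is a difference in group means), and the quadratic is fitted exactly through the three levels lambda = 0, 1, 2. All function names are hypothetical.

```python
import random


def extra_flip_prob(p: float, lam: float) -> float:
    """Flip probability of Pi^lam when Pi is symmetric with flip probability p."""
    return 0.5 * (1.0 - (1.0 - 2.0 * p) ** lam)


def naive_slope(x, y):
    """OLS slope on a binary covariate: difference between the two group means."""
    y1 = [yi for xi, yi in zip(x, y) if xi == 1]
    y0 = [yi for xi, yi in zip(x, y) if xi == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


def quadratic_extrapolate(f0: float, f1: float, f2: float) -> float:
    """Value at lam = -1 of the quadratic through (0, f0), (1, f1), (2, f2)."""
    return 3.0 * f0 - 3.0 * f1 + f2


def mc_simex(x_obs, y, p, n_sim=200, seed=0):
    """Quadratic MC-SIMEX estimate of the slope for a misclassified binary covariate."""
    rng = random.Random(seed)
    means = [naive_slope(x_obs, y)]  # lam = 0: the observed data themselves
    for lam in (1, 2):
        q = extra_flip_prob(p, lam)  # extra misclassification to add at this level
        total = 0.0
        for _ in range(n_sim):
            x_sim = [1 - xi if rng.random() < q else xi for xi in x_obs]
            total += naive_slope(x_sim, y)
        means.append(total / n_sim)
    return quadratic_extrapolate(means[0], means[1], means[2])
```

In this symmetric setting with balanced groups, the expected naive slope shrinks geometrically in lambda, so the quadratic is only an approximation to the true extrapolation curve. That gap between an approximate and an exact extrapolation function is the issue the abstract describes.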

Statistics in Medicine 45(3-5): e70418.
Citations: 0
A Functional Joint Model for Survival and Multivariate Sparse Functional Data in Multi-Cohort Alzheimer's Disease Study.
IF 1.8; Zone 4, Medicine; Q3, Mathematical & Computational Biology; Pub Date: 2026-02-01; DOI: 10.1002/sim.70442
Wenyi Wang, Luo Xiao, Ruonan Li, Sheng Luo

We develop an integrative joint model for multivariate sparse functional and survival data to analyze Alzheimer's disease (AD) across multiple studies. To address missing-by-design outcomes in multi-cohort studies, our approach extends the multivariate functional mixed model (MFMM), which integrates longitudinal outcomes to extract shared disease progression trajectories and links these outcomes to time-to-event data through a parsimonious survival model. This framework balances flexibility and interpretability by modeling shared progression trajectories while accommodating cohort-specific mean functions and survival parameters. For efficient estimation, we incorporate penalized splines into an EM algorithm. Application to three AD cohorts demonstrates the model's ability to capture disease trajectories and account for inter-cohort variability. Simulation studies confirm its robustness and accuracy, highlighting its value in advancing the understanding of AD progression and supporting clinical decision-making in multi-cohort settings.

Statistics in Medicine 45(3-5): e70442.
Citations: 0
Group Lasso Based Selection for High-Dimensional Mediation Analysis.
IF 1.8; Zone 4, Medicine; Q3, Mathematical & Computational Biology; Pub Date: 2026-02-01; DOI: 10.1002/sim.70351
Allan Jérolon, Flora Alarcon, Florence Pittion, Magali Richard, Olivier François, Etienne Birmelé, Vittorio Perduca

Mediation analysis aims to identify and estimate the effect of an exposure on an outcome that is mediated through one or more intermediate variables. In the presence of multiple intermediate variables, two pertinent methodological questions arise: estimating mediated effects when mediators are correlated, and performing high-dimensional mediation analyses when the number of mediators exceeds the sample size. This paper presents a two-step procedure for high-dimensional mediation analyses. The first step selects a reduced number of candidate mediators using an ad-hoc lasso penalty. The second step applies a procedure we previously developed to estimate the mediated effects, accounting for the correlation structure among the retained candidate mediators. We compare the performance of the proposed two-step procedure with state-of-the-art methods using simulated data. Additionally, we demonstrate its practical application by estimating the causal role of DNA methylation (DNAm) in the pathway between smoking and rheumatoid arthritis (RA) using real data.
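
The selection step can be pictured through the group soft-thresholding operator that underlies group lasso penalties: a whole group of coefficients is either shrunk together or set to zero together. The sketch below assumes each mediator j forms a group of two coefficients, (alpha_j, beta_j), for the exposure-to-mediator and mediator-to-outcome paths. That grouping, the function names, and the numbers are assumptions for illustration; the abstract only says that an ad-hoc lasso penalty is used.

```python
import math


def group_soft_threshold(group, lam):
    """Shrink a coefficient group toward zero; drop it entirely if its norm <= lam."""
    norm = math.sqrt(sum(v * v for v in group))
    if norm <= lam:
        return [0.0] * len(group)
    scale = 1.0 - lam / norm
    return [scale * v for v in group]


def select_mediators(coef_pairs, lam):
    """Indices of mediators whose (alpha_j, beta_j) group survives thresholding."""
    selected = []
    for j, pair in enumerate(coef_pairs):
        if any(v != 0.0 for v in group_soft_threshold(pair, lam)):
            selected.append(j)
    return selected


# Hypothetical (alpha_j, beta_j) pairs for three candidate mediators.
pairs = [(3.0, 4.0), (0.1, 0.2), (0.0, 2.0)]
print(select_mediators(pairs, lam=1.0))
```

Treating the pair as one group means a mediator is retained or discarded as a unit, so the second step only has to estimate mediated effects among the retained candidates.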

Statistics in Medicine 45(3-5): e70351.
Citations: 0
Assessing the Benefits and Burdens of Preventive Interventions.
IF 1.8 Zone 4 Medicine Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2026-02-01 DOI: 10.1002/sim.70410
Yi Xiong, Kwun C G Chan, Malka Gorfine, Li Hsu

Cancer prevention is recognized as a key strategy for reducing disease incidence, mortality, and the overall burden on individuals and society. However, determining when to begin preventive interventions presents a significant challenge: starting too early may lead to more interventions and increased lifetime burdens due to repeated administrations, while delaying may miss opportunities to prevent cancer. Evidence-based recommendations require a benefit-burden analysis that weighs life-years gained against the burden of interventions. With the growing availability of large-scale observational data, there is now an opportunity to empirically evaluate these trade-offs. In this paper, we propose a causal framework for assessing the benefit and burden of cancer prevention, using an illness-death model with semi-competing risks. Extensive simulations demonstrate that the proposed estimators are unbiased, with robust inference across realistic scenarios. We apply this approach to a benefit-burden analysis of the preventive screening for colorectal cancer, utilizing data from the large-scale Women's Health Initiative. Our findings suggest that initiating screening at age 50 years achieves the highest life-year gains with an acceptable incremental burden-to-benefit ratio compared to no screening, contributing valuable real-world evidence to the field of preventive cancer interventions.

Multivariate and Online Transfer Learning With Uncertainty Quantification.
IF 1.8 Zone 4 Medicine Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2026-02-01 DOI: 10.1002/sim.70398
Jimmy Hickey, Jonathan P Williams, Brian J Reich, Emily C Hector

Untreated periodontitis causes inflammation within the supporting tissue of the teeth and can ultimately lead to tooth loss. Modeling periodontal outcomes is beneficial as they are difficult and time-consuming to measure, but disparities in representation between demographic groups must be considered. There may not be enough participants to build group-specific models, and it can be ineffective, and even dangerous, to apply a model to participants in an underrepresented group if demographic differences were not considered during training. We propose an extension to the RECaST Bayesian transfer learning framework. Our method jointly models multivariate outcomes, exhibiting significant improvement over the previous univariate RECaST method. Further, we introduce an online approach to model sequential data sets. Negative transfer is mitigated to ensure that the information shared from the other demographic groups does not negatively impact the modeling of the underrepresented participants. The Bayesian framework naturally provides uncertainty quantification on predictions. Especially important in medical applications, our method does not share data between domains. We demonstrate the effectiveness of our method in both predictive performance and uncertainty quantification on simulated data and on a database of dental records from the HealthPartners Institute.
