
Statistics in Medicine: Latest Articles

What's the Weight? Estimating Controlled Outcome Differences in Complex Surveys for Health Disparities Research.
IF 1.8; Zone 4 (Medicine); Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-10-01; DOI: 10.1002/sim.70289
Stephen Salerno, Emily K Roberts, Belinda L Needham, Tyler H McCormick, Fan Li, Bhramar Mukherjee, Xu Shi

In this work, we are motivated by the problem of estimating racial disparities in health outcomes, specifically the average controlled difference (ACD) in telomere length between Black and White individuals, using data from the National Health and Nutrition Examination Survey (NHANES). To do so, we build a propensity for race to properly adjust for other social determinants while characterizing the controlled effect of race on telomere length. Propensity score methods are broadly employed with observational data as a tool to achieve covariate balance, but how to implement them in complex surveys is less studied, in particular when the survey weights depend on the group variable under comparison (as the NHANES sampling scheme depends on self-reported race). We propose identification formulas to properly estimate the ACD in outcomes between Black and White individuals, with appropriate weighting for both covariate imbalance across the two racial groups and generalizability. Via extensive simulation, we show that our proposed methods outperform traditional analytic approaches in terms of bias, mean squared error, and coverage when estimating the ACD for our setting of interest. In our data, we find that evidence of racial differences in telomere length between Black and White individuals attenuates after accounting for confounding by socioeconomic factors and utilizing appropriate propensity score and survey weighting techniques. Software to implement these methods and code to reproduce our results can be found in the R package svycdiff, available through the Comprehensive R Archive Network (CRAN) at cran.r-project.org/web/packages/svycdiff/, or in a development version on GitHub at github.com/salernos/svycdiff.
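As a rough illustration of the kind of weighting the abstract discusses, the sketch below computes a weighted difference in group means in which each unit's weight is its survey weight multiplied by the inverse probability of belonging to its own group. This is a generic textbook-style construction, not the estimator proposed in the paper and not the svycdiff API; the function names and the inputs (`survey_w`, `propensity`) are hypothetical. The abstract itself cautions that naive combinations of this sort need care when the survey weights depend on the group variable.

```python
def weighted_mean(values, weights):
    """Weighted average with weights normalized to sum to one."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def weighted_group_difference(y, group, survey_w, propensity):
    """Weighted mean of y in group 1 minus weighted mean of y in group 0.

    Assumption (illustrative only): each unit is weighted by its survey
    weight times the inverse of the estimated probability of being in
    the group it actually belongs to.
    """
    y1, w1, y0, w0 = [], [], [], []
    for yi, gi, si, pi in zip(y, group, survey_w, propensity):
        if gi == 1:
            y1.append(yi)
            w1.append(si / pi)          # inverse probability of group 1
        else:
            y0.append(yi)
            w0.append(si / (1.0 - pi))  # inverse probability of group 0
    return weighted_mean(y1, w1) - weighted_mean(y0, w0)
```

With equal survey weights and a constant propensity of 0.5, the function reduces to the plain difference in group means, which is a quick sanity check on any implementation of this kind.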

Statistics in Medicine 44(23-24): e70289. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12636266/pdf/
Citations: 0
Dynamic Prediction Using Functional Latent Trait Joint Models for Multivariate Longitudinal Outcomes: An Application to Parkinson's Disease.
IF 1.8; Zone 4 (Medicine); Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-10-01; DOI: 10.1002/sim.70285
Mohammad Samsul Alam, Dongrak Choi, Salil Koner, Sheng Luo

The progressive and multifaceted nature of Parkinson's disease (PD) calls for the integration of diverse data types, including continuous, ordinal, and binary, in longitudinal studies for a comprehensive understanding of symptom progression and disease trajectory. Significant terminal events, such as severe disability or mortality, highlight the need for joint modeling approaches that simultaneously address multivariate outcomes and time-to-event data. We introduce functional latent trait model-joint model (FLTM-JM), a novel joint modeling framework based on the functional latent trait model (FLTM), to jointly analyze multivariate longitudinal data and survival outcomes. The FLTM component leverages a non-parametric, function-on-scalar regression framework, enabling flexible modeling of complex relationships between covariates and patient outcomes over time. This joint modeling approach supports dynamic, subject-specific predictions, offering valuable insights for personalized treatment strategies. Applied to Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS) data from the Parkinson's Progression Markers Initiative (PPMI), our model effectively identifies the influence of key covariates and demonstrates the utility of dynamic predictions in clinical decision-making. Extensive simulation studies validate the accuracy, robustness, and computational efficiency of FLTM-JM, even under model misspecification.

Statistics in Medicine 44(23-24): e70285. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12614809/pdf/
Citations: 0
Estimating Risk Factors for Pathogenic Dose Accrual From Longitudinal Data.
IF 1.8; Zone 4 (Medicine); Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-10-01; DOI: 10.1002/sim.70291
Daniel K Sewell, Kelly K Baker

Estimating risk factors for the incidence of a disease is crucial for understanding its etiology. For diseases caused by enteric pathogens, off-the-shelf statistical model-based approaches do not consider the biological mechanisms through which infection occurs and thus can only be used to make comparatively weak statements about the association between risk factors and incidence. Building off of established work in quantitative microbiological risk assessment, we propose a new approach to determining the association between risk factors and dose accrual rates. Our more mechanistic approach achieves a higher degree of biological plausibility, incorporates currently ignored sources of variability, and provides regression parameters that are easily interpretable as the dose accrual rate ratio due to changes in the risk factors under study. We also describe a method for leveraging information across multiple pathogens. The proposed methods are available as an R package at https://github.com/dksewell/dare. Our simulation study shows unacceptable coverage rates from generalized linear models, while the proposed approach empirically maintains the nominal rate even when the model is misspecified. Finally, we demonstrated our proposed approach by applying our method to infant data obtained through the PATHOME study (https://reporter.nih.gov/project-details/10227256), discovering the impact of various environmental factors on infant enteric infections.

Statistics in Medicine 44(23-24): e70291. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12503088/pdf/
Citations: 0
An Approach to Design Adaptive Clinical Trials With Time-to-Event Outcomes Based on a General Bayesian Posterior Distribution.
IF 1.8; Zone 4 (Medicine); Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-10-01; DOI: 10.1002/sim.70207
James M McGree, Antony M Overstall, Mark Jones, Robert K Mahar

Clinical trials are an integral component of medical research. Trials require careful design to, for example, maintain the safety of participants and to use resources efficiently. Adaptive clinical trials are often more efficient and ethical than standard or non-adaptive trials because they can require fewer participants, target more promising treatments, and stop early with sufficient evidence of effectiveness or harm. The design of adaptive trials is usually undertaken via simulation, which requires assumptions about the data-generating process to be specified a priori. Unfortunately, if such assumptions are misspecified, then the resulting trial design may not perform as expected, leading to, for example, reduced statistical power or an increased Type I error. Motivated by a clinical trial of a vaccine to protect against gastroenteritis in infants, we propose an approach to design adaptive clinical trials with time-to-event outcomes without needing to explicitly define the data-generating process. To facilitate this, we consider trial design within a general Bayesian framework where inference about the treatment effect is based on the partial likelihood. As a result, inference is robust to the form of the baseline hazard function, and we exploit this property to undertake trial design when the data-generating process is only implicitly defined. The benefits of this approach are demonstrated via an illustrative example and via redesigning our motivating clinical trial.
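To make concrete why inference based on the partial likelihood does not depend on the baseline hazard, the following minimal sketch evaluates the Cox partial log-likelihood for a single binary treatment indicator. It assumes no tied event times and is only an illustration of the standard partial likelihood, not the authors' trial-design procedure; the function and argument names are hypothetical. Note that the baseline hazard never appears: each event contributes only the ratio of its own relative hazard to the summed relative hazards of those still at risk.

```python
import math


def cox_partial_log_likelihood(beta, times, events, treated):
    """Cox partial log-likelihood for one binary covariate.

    Assumptions (illustrative): no tied event times; events[i] is 1 for
    an observed event and 0 for a censored observation.
    """
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # censored subjects enter only through risk sets
        # Sum of relative hazards over everyone still at risk at time t_i.
        risk_sum = sum(
            math.exp(beta * x_j)
            for t_j, x_j in zip(times, treated)
            if t_j >= t_i
        )
        total += beta * treated[i] - math.log(risk_sum)
    return total
```

At `beta = 0` every subject has relative hazard 1, so the value reduces to minus the sum of the log risk-set sizes at the observed event times, which gives a simple check.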

Statistics in Medicine 44(23-24): e70207. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12510400/pdf/
Citations: 0
Deep Mixture of Linear Mixed Models for Complex Longitudinal Data.
IF 1.8; Zone 4 (Medicine); Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-10-01; DOI: 10.1002/sim.70288
Lucas Kock, Nadja Klein, David J Nott

Mixtures of linear mixed models are widely used for modeling longitudinal data for which observation times differ between subjects. In typical applications, temporal trends are described using a basis expansion, with basis coefficients treated as random effects varying by subject. Additional random effects can describe variation between mixture components or other known sources of variation in complex designs. A key advantage of these models is that they provide a natural mechanism for clustering. Current versions of mixtures of linear mixed models are not specifically designed for the case where there are many observations per subject and complex temporal trends, which require a large number of basis functions to capture. In this case, the subject-specific basis coefficients are a high-dimensional random effects vector, for which the covariance matrix is hard to specify and estimate, especially if it varies between mixture components. To address this issue, we consider the use of deep mixture of factor analyzers models as a prior for the random effects. The resulting deep mixture of linear mixed models is well suited for high-dimensional settings, and we describe an efficient variational inference approach to posterior computation. The efficacy of the method is demonstrated in biomedical applications and on simulated data.

Statistics in Medicine 44(23-24): e70288. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12503021/pdf/
Citations: 0
Accounting for Misclassification of Cause of Death in Weighted Cumulative Incidence Functions for Causal Analyses.
IF 1.8; Zone 4 (Medicine); Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-10-01; DOI: 10.1002/sim.70281
Jessie K Edwards, Bonnie E Shook-Sa, Giorgos Bakoyannis, Paul N Zivich, Michael E Herce, Stephen R Cole

Misclassification between causes of death can produce bias in estimated cumulative incidence functions. When estimating causal quantities, such as comparing the cumulative incidence of death due to specific causes under interventions, such bias can lead to suboptimal decision making. Here, a consistent semiparametric estimator of the cumulative incidence function under interventions in settings with misclassification between two event types is presented. The measurement parameters for this estimator can be informed by validation data or expert knowledge. Moreover, a modified bootstrap approach to variance estimation is proposed for confidence interval construction. The proposed estimator was applied to estimate the cumulative incidence of AIDS-related mortality in the Multicenter AIDS Cohort Study under single- versus combination-drug antiretroviral therapy regimens that may be subject to confounding. The proposed estimator is shown to be consistent and to perform well in finite samples via a series of simulation experiments.
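The idea of correcting cause-specific incidence for misclassification between two event types can be sketched with a simple algebraic back-correction at a single time point. The example below assumes known sensitivity and specificity of cause classification that do not vary over time or across subjects; it is a generic illustration and not the semiparametric estimator described in the abstract. The function name and arguments are hypothetical.

```python
def correct_two_cause_incidence(obs_cause1, obs_cause2, sensitivity, specificity):
    """Recover true cause-specific cumulative incidences at one time point.

    Assumptions (illustrative):
      sensitivity = P(classified as cause 1 | true cause 1)
      specificity = P(classified as cause 2 | true cause 2)
    so that obs_cause1 = sensitivity * true1 + (1 - specificity) * true2.
    """
    denom = sensitivity + specificity - 1.0
    if abs(denom) < 1e-12:
        raise ValueError("classification is uninformative (sensitivity + specificity = 1)")
    # Misclassification moves deaths between causes but not the all-cause total.
    total = obs_cause1 + obs_cause2
    true_cause1 = (obs_cause1 - (1.0 - specificity) * total) / denom
    return true_cause1, total - true_cause1
```

In practice the sensitivity and specificity would come from validation data or expert knowledge, as the abstract notes, and their uncertainty would need to be carried into the interval estimate.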

Statistics in Medicine 44(23-24): e70281. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12695060/pdf/
Citations: 0
Estimating and Evaluating Counterfactual Prediction Models.
IF 1.8; Zone 4 (Medicine); Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-10-01; DOI: 10.1002/sim.70287
Christopher B Boyer, Issa J Dahabreh, Jon A Steingrimsson

Counterfactual prediction methods are required when a model will be deployed in a setting where treatment policies differ from the setting where the model was developed, or when a model provides predictions under hypothetical interventions to support decision-making. However, estimating and evaluating counterfactual prediction models is challenging because, unlike traditional (factual) prediction, one does not observe the potential outcomes for all individuals under all treatment strategies of interest. Here, we discuss how to estimate a counterfactual prediction model, how to assess the model's performance, and how to perform model and tuning parameter selection. We provide identification and estimation results for counterfactual prediction models and for multiple measures of counterfactual model performance, including loss-based measures, the area under the receiver operating characteristic curve, and the calibration curve. Importantly, our results allow valid estimates of model performance under counterfactual intervention even if the candidate prediction model is misspecified, permitting a wider array of use cases. We illustrate these methods using simulation and apply them to the task of developing a statin-naïve risk prediction model for cardiovascular disease.

Statistics in Medicine 44(23-24): e70287. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12503020/pdf/
Citations: 0
Modeling the Role of Baseline Risk and Additional Study-Level Covariates in Meta-Analysis of Treatment Effects.
IF 1.8; Zone 4 (Medicine); Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2025-10-01; DOI: 10.1002/sim.70278
Phuc T Tran, Annamaria Guolo

The relationship between the treatment effect and the baseline risk is a recognized tool to investigate the heterogeneity of treatment effects in meta-analyses of clinical trials. Since the baseline risk is difficult to measure, a proxy is adopted, which is based on the rate of events for the subject under the control condition. The use of the proxy in terms of aggregated information at the study level implies that the data are affected by measurement errors, a problem that the literature has explored and addressed in recent years. This paper proposes an extension of the classical meta-analysis with baseline risk information, which includes additional study-specific covariates other than the rate of events to explain heterogeneity. Likelihood-based inference is carried out by including measurement error correction techniques necessary to prevent unreliable inference due to the measurement errors affecting the covariates summarized at the study level. Within-study covariances between risk measures and the covariate components are computed using Taylor expansions based on study-level covariate subgroup summary information. When such information is not available and, more generally, in order to reduce computational difficulties, a pseudo-likelihood solution is developed under a working independence assumption between the observed error-prone measures. The performance of the methods is investigated in a series of simulation studies under different specifications for the sample size, the between-study heterogeneity, and the underlying risk distribution. They are applied to a meta-analysis about the association between COVID-19 and schizophrenia.

Statistics in Medicine, vol. 44, no. 23-24, e70278. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12548021/pdf/
Citations: 0
Specification of Estimands for Complex Disease Processes Using Multistate Models and Utility Functions.
IF 1.8 Zone 4, Medicine Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-10-01 DOI: 10.1002/sim.70269
Alexandra Bühler, Richard J Cook, Jerald F Lawless

In complex diseases, individuals are often at risk of several types of possibly semi-competing events and may experience recurrent symptomatic episodes. This complex disease course makes it challenging to define target estimands for clinical trials. While composite endpoints are routinely adopted, recent innovations involving the win ratio and other methods based on ranking the disease course have received considerable attention. We emphasize the usefulness of multistate models for addressing challenges arising in complex diseases, along with the simplicity and interpretability that come from defining utilities to synthesize evidence of treatment effects on different aspects of the disease process. Robust variance estimation based on the infinitesimal jackknife means that such methods can be used as the basis of primary analyses of clinical trials. We illustrate the use of utilities for the assessment of bleeding outcomes in a trial of cancer patients with thrombocytopenia.
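One way to picture a utility-based estimand is as a utility-weighted sum of state occupancy probabilities over follow-up, compared between arms. The sketch below is only a toy under stated assumptions: the three states, the utility values, the occupancy probabilities, and the function name `expected_utility` are all hypothetical and are not taken from the paper.

```python
def expected_utility(occupancy, utilities, dt=1.0):
    """Sum utility-weighted state occupancy probabilities over a time grid.

    occupancy: one list of state probabilities per time point (each sums to 1).
    utilities: one utility value per state.
    dt: width of each time interval (rectangle-rule approximation).
    """
    total = 0.0
    for probs in occupancy:
        if len(probs) != len(utilities):
            raise ValueError("one probability per state is required")
        if abs(sum(probs) - 1.0) > 1e-9:
            raise ValueError("occupancy probabilities must sum to 1")
        total += dt * sum(u * p for u, p in zip(utilities, probs))
    return total


# Hypothetical states: event-free, symptomatic episode, dead.
utilities = [1.0, 0.5, 0.0]
control = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.6, 0.3, 0.1]]
treated = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.8, 0.15, 0.05]]

effect = expected_utility(treated, utilities) - expected_utility(control, utilities)
print(round(effect, 3))
```

In practice the occupancy probabilities would be estimated from a fitted multistate model, and the choice of utilities is a clinical judgment that should be specified in advance.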

Statistics in Medicine, vol. 44, no. 23-24, e70269. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12519945/pdf/
Citations: 0
ChatGPT as a Tool for Biostatisticians: A Tutorial on Applications, Opportunities, and Limitations.
IF 1.8 Zone 4, Medicine Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-10-01 DOI: 10.1002/sim.70263
Dennis Dobler, Harald Binder, Anne-Laure Boulesteix, Jan-Bernd Igelmann, David Köhler, Ulrich Mansmann, Markus Pauly, André Scherag, Matthias Schmid, Amani Al Tawil, Susanne Weber

Modern large language models (LLMs) have reshaped the workflows of people across countless fields, and biostatistics is no exception. These models offer novel support in drafting study plans, generating software code, or writing reports. However, reliance on LLMs carries the risk of inaccuracies due to potential hallucinations that may produce fabricated "facts", leading to erroneous statistical statements and conclusions. Such errors could compromise the high precision and transparency fundamental to our field. This tutorial aims to illustrate the impact of LLM-based applications on various contemporary biostatistical tasks. We will explore both the risks and opportunities presented by this new era of artificial intelligence. Our ultimate conclusion emphasizes that advanced applications should only be used in combination with sufficient background knowledge. Over time, consistently verifying LLM outputs may lead to an appropriately calibrated trust in these tools among users.

Statistics in Medicine, vol. 44, no. 23-24, e70263. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12548020/pdf/
Citations: 0