A comparison of some existing and novel methods for integrating historical models to improve estimation of coefficients in logistic regression.

Impact factor: 1.5; Zone 3 (Mathematics); JCR Q2 (SOCIAL SCIENCES, MATHEMATICAL METHODS); Journal of the Royal Statistical Society Series A-Statistics in Society; Pub Date: 2024-09-24; eCollection Date: 2025-01-01; DOI: 10.1093/jrsssa/qnae093
Philip S Boonstra, Pedro Orozco Del Pino
Journal of the Royal Statistical Society Series A-Statistics in Society, volume 188, issue 1, pages 46-67. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11728056/pdf/

Abstract

Model integration refers to the process of incorporating a fitted historical model into the estimation of a current study to increase statistical efficiency. Integration can be challenging when the current model includes new covariates, leading to potential model misspecification. We present and evaluate seven existing and novel model integration techniques, which employ both likelihood constraints and Bayesian informative priors. Using a simulation study of logistic regression, we quantify how efficiency (assessed by bias and variance) changes with the sample sizes of both historical and current studies and in response to violations of transportability assumptions. We also apply these methods to a case study in which the goal is to use novel predictors to update a risk prediction model for in-hospital mortality among pediatric extracorporeal membrane oxygenation patients. Our simulation study and case study suggest that (i) when the historical sample size is small, accounting for this statistical uncertainty is more efficient; (ii) all methods lose efficiency when there are differences between the historical and current data-generating mechanisms; and (iii) additional shrinkage to zero can improve efficiency in higher-dimensional settings, but at the cost of bias in estimation.
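As a rough illustration of the "Bayesian informative prior" idea mentioned in the abstract, the sketch below fits a logistic regression by maximum a posteriori (MAP) estimation, with independent Gaussian priors centered at historical coefficient estimates for the original covariates and at zero for a new covariate. This is a generic textbook construction and is not claimed to be any of the seven methods the paper evaluates. The function name `fit_map_logistic`, the simulated data, and the historical estimates and standard errors are all hypothetical values chosen for the example. Note also that centering the prior directly at the historical estimates ignores the fact that coefficients from a model without the new covariate need not match those of the expanded model, which is the misspecification concern the abstract raises.

```python
import numpy as np


def fit_map_logistic(X, y, prior_mean, prior_precision, n_iter=50, tol=1e-10):
    """MAP estimate for logistic regression under independent Gaussian priors.

    Maximizes: log-likelihood - 0.5 * sum(prior_precision * (beta - prior_mean)**2)
    using Newton's method. (Hypothetical helper, not from the paper.)
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    prior_precision = np.asarray(prior_precision, dtype=float)
    beta = prior_mean.copy()  # start at the prior mean
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))  # fitted probabilities
        # Gradient of the log-posterior
        grad = X.T @ (y - p) - prior_precision * (beta - prior_mean)
        # Negative Hessian of the log-posterior (positive definite)
        w = p * (1.0 - p)
        neg_hess = X.T @ (X * w[:, None]) + np.diag(prior_precision)
        step = np.linalg.solve(neg_hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Simulated "current" study: intercept, one historical covariate, one new covariate.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
true_beta = np.array([-0.5, 1.0, 0.7])  # assumed values for the simulation
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))

# Hypothetical historical estimates and standard errors (intercept, covariate 1).
hist_mean = np.array([-0.4, 0.9])
hist_se = np.array([0.15, 0.15])

# Informative prior on historical coefficients; weak zero-centered prior on the new one.
prior_mean = np.concatenate([hist_mean, [0.0]])
prior_precision = np.concatenate([1.0 / hist_se**2, [1.0 / 10.0**2]])

beta_map = fit_map_logistic(X, y, prior_mean, prior_precision)
print("MAP estimate:", beta_map)
```

Using the squared historical standard errors as prior variances is one simple way to let a small historical sample carry less weight, in the spirit of finding (i); a larger prior precision on the new covariate would correspond to the extra shrinkage toward zero described in finding (iii).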

Source journal
CiteScore: 2.90
Self-citation rate: 5.00%
Articles published: 136
Review time: >12 weeks
About the journal: Series A (Statistics in Society) publishes high quality papers that demonstrate how statistical thinking, design and analyses play a vital role in all walks of life and benefit society in general. There is no restriction on subject-matter: any interesting, topical and revelatory applications of statistics are welcome. For example, important applications of statistical and related data science methodology in medicine, business and commerce, industry, economics and finance, education and teaching, physical and biomedical sciences, the environment, the law, government and politics, demography, psychology, sociology and sport all fall within the journal's remit. The journal is therefore aimed at a wide statistical audience and at professional statisticians in particular. Its emphasis is on well-written and clearly reasoned quantitative approaches to problems in the real world rather than the exposition of technical detail. Thus, although the methodological basis of papers must be sound and adequately explained, methodology per se should not be the main focus of a Series A paper. Of particular interest are papers on topical or contentious statistical issues, papers which give reviews or exposés of current statistical concerns and papers which demonstrate how appropriate statistical thinking has contributed to our understanding of important substantive questions. Historical, professional and biographical contributions are also welcome, as are discussions of methods of data collection and of ethical issues, provided that all such papers have substantial statistical relevance.
Latest articles from this journal
Studying Chinese immigrants' spatial distribution in the Raleigh-Durham area by linking survey and commercial data using romanized names.
A comparison of some existing and novel methods for integrating historical models to improve estimation of coefficients in logistic regression.
Synthesis estimators for transportability with positivity violations by a continuous covariate.
Data-integration with pseudoweights and survey-calibration: application to developing US-representative lung cancer risk models for use in screening.
A framework for understanding selection bias in real-world healthcare data.