A comparison of some existing and novel methods for integrating historical models to improve estimation of coefficients in logistic regression
Authors: Philip S Boonstra, Pedro Orozco Del Pino
Journal: Journal of the Royal Statistical Society Series A-Statistics in Society, 188(1), pages 46-67
Published: 2024-09-24 (eCollection 2025-01-01)
Publication type: Journal Article
DOI: 10.1093/jrsssa/qnae093 (https://doi.org/10.1093/jrsssa/qnae093)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11728056/pdf/
Impact factor: 1.5; JCR: Q2 (Social Sciences, Mathematical Methods)
Abstract
Model integration refers to the process of incorporating a fitted historical model into the estimation of a current study to increase statistical efficiency. Integration can be challenging when the current model includes new covariates, leading to potential model misspecification. We present and evaluate seven existing and novel model integration techniques, which employ both likelihood constraints and Bayesian informative priors. Using a simulation study of logistic regression, we quantify how efficiency, as assessed by bias and variance, changes with the sample sizes of both historical and current studies and in response to violations of transportability assumptions. We also apply these methods to a case study in which the goal is to use novel predictors to update a risk prediction model for in-hospital mortality among pediatric extracorporeal membrane oxygenation patients. Our simulation study and case study suggest that (i) when the historical sample size is small, accounting for the resulting statistical uncertainty in the historical estimates is more efficient; (ii) all methods lose efficiency when the historical and current data-generating mechanisms differ; and (iii) additional shrinkage to zero can improve efficiency in higher-dimensional settings, but at the cost of bias in estimation.
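To make the idea of a Bayesian informative prior concrete, the following is a minimal sketch and not one of the seven methods evaluated in the paper. It assumes a hypothetical historical fit (`hist_coef`, `hist_se`) for an intercept and one established covariate, and it naively centers independent normal priors on those values while giving the new covariate a nearly flat prior centered at zero. All names, values, and the simulated data are illustrative assumptions. The sketch also ignores the misspecification issue raised above: a coefficient from the smaller historical model need not equal the corresponding coefficient once a new covariate is added.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_map_logistic(X, y, prior_mean, prior_sd, n_iter=2000):
    """Posterior mode for logistic regression under independent normal priors.

    Plain gradient ascent on (log-likelihood + log-prior) / n with a fixed
    step 1/L, where L is an upper bound on the curvature of that objective.
    """
    n, p = len(X), len(prior_mean)
    precision = [1.0 / sd ** 2 for sd in prior_sd]
    max_sq_norm = max(sum(v * v for v in row) for row in X)
    step = 1.0 / (0.25 * max_sq_norm + max(precision) / n)
    beta = list(prior_mean)
    for _ in range(n_iter):
        # Gradient contribution of the normal log-prior.
        grad = [-precision[j] * (beta[j] - prior_mean[j]) for j in range(p)]
        # Gradient contribution of the Bernoulli log-likelihood.
        for row, yi in zip(X, y):
            resid = yi - sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j in range(p):
                grad[j] += resid * row[j]
        beta = [b + step * g / n for b, g in zip(beta, grad)]
    return beta


# Simulated "current" study (illustrative values only):
# columns are intercept, established covariate, new covariate.
rng = random.Random(2024)
true_beta = [-0.5, 1.0, 0.8]
X, y = [], []
for _ in range(60):
    row = [1.0, rng.gauss(0, 1), rng.gauss(0, 1)]
    prob = sigmoid(sum(b * v for b, v in zip(true_beta, row)))
    X.append(row)
    y.append(1 if rng.random() < prob else 0)

# Hypothetical historical estimates and standard errors (assumed, not from the paper).
hist_coef = [-0.4, 0.9]
hist_se = [0.2, 0.2]

# Informative prior on the shared coefficients, nearly flat prior on the new one.
informative = fit_map_logistic(X, y, hist_coef + [0.0], hist_se + [10.0])
# Same prior centers, but nearly flat everywhere: close to maximum likelihood.
weak = fit_map_logistic(X, y, hist_coef + [0.0], [10.0, 10.0, 10.0])

print("informative prior:", [round(b, 3) for b in informative])
print("weak prior:       ", [round(b, 3) for b in weak])
```

Shrinking `hist_se` pulls the shared coefficients harder toward the historical values, which mirrors the abstract's point that the uncertainty of a small historical study should be reflected rather than ignored. Gradient ascent is used only to keep the sketch dependency-free; a real analysis would use a dedicated fitting routine or full posterior sampling.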
About the journal:
Series A (Statistics in Society) publishes high-quality papers that demonstrate how statistical thinking, design and analyses play a vital role in all walks of life and benefit society in general. There is no restriction on subject-matter: any interesting, topical and revelatory applications of statistics are welcome. For example, important applications of statistical and related data science methodology in medicine, business and commerce, industry, economics and finance, education and teaching, physical and biomedical sciences, the environment, the law, government and politics, demography, psychology, sociology and sport all fall within the journal's remit. The journal is therefore aimed at a wide statistical audience and at professional statisticians in particular. Its emphasis is on well-written and clearly reasoned quantitative approaches to problems in the real world rather than the exposition of technical detail. Thus, although the methodological basis of papers must be sound and adequately explained, methodology per se should not be the main focus of a Series A paper. Of particular interest are papers on topical or contentious statistical issues, papers which give reviews or exposés of current statistical concerns and papers which demonstrate how appropriate statistical thinking has contributed to our understanding of important substantive questions. Historical, professional and biographical contributions are also welcome, as are discussions of methods of data collection and of ethical issues, provided that all such papers have substantial statistical relevance.