Development and Internal Validation of an Interpretable Machine Learning Model to Predict Readmissions in a United States Healthcare System

Informatics (IF 3.4; JCR Q2, Computer Science, Interdisciplinary Applications). Pub Date: 2023-03-27. DOI: 10.3390/informatics10020033
Amanda L. Luo, Akshay Ravi, Simone Arvisais-Anhalt, Anoop Muniyappa, Xinran Liu, Sha Wang
{"title":"Development and Internal Validation of an Interpretable Machine Learning Model to Predict Readmissions in a United States Healthcare System","authors":"Amanda L. Luo, Akshay Ravi, Simone Arvisais-Anhalt, Anoop Muniyappa, Xinran Liu, Sha Wang","doi":"10.3390/informatics10020033","DOIUrl":null,"url":null,"abstract":"(1) One in four hospital readmissions is potentially preventable. Machine learning (ML) models have been developed to predict hospital readmissions and risk-stratify patients, but thus far they have been limited in clinical applicability, timeliness, and generalizability. (2) Methods: Using deidentified clinical data from the University of California, San Francisco (UCSF) between January 2016 and November 2021, we developed and compared four supervised ML models (logistic regression, random forest, gradient boosting, and XGBoost) to predict 30-day readmissions for adults admitted to a UCSF hospital. (3) Results: Of 147,358 inpatient encounters, 20,747 (13.9%) patients were readmitted within 30 days of discharge. The final model selected was XGBoost, which had an area under the receiver operating characteristic curve of 0.783 and an area under the precision-recall curve of 0.434. The most important features by Shapley Additive Explanations were days since last admission, discharge department, and inpatient length of stay. (4) Conclusions: We developed and internally validated a supervised ML model to predict 30-day readmissions in a US-based healthcare system. 
This model has several advantages including state-of-the-art performance metrics, the use of clinical data, the use of features available within 24 h of discharge, and generalizability to multiple disease states.","PeriodicalId":37100,"journal":{"name":"Informatics","volume":"10 1","pages":"33"},"PeriodicalIF":3.4000,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/informatics10020033","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

Abstract

(1) Background: One in four hospital readmissions is potentially preventable. Machine learning (ML) models have been developed to predict hospital readmissions and risk-stratify patients, but thus far they have been limited in clinical applicability, timeliness, and generalizability. (2) Methods: Using deidentified clinical data from the University of California, San Francisco (UCSF) between January 2016 and November 2021, we developed and compared four supervised ML models (logistic regression, random forest, gradient boosting, and XGBoost) to predict 30-day readmissions for adults admitted to a UCSF hospital. (3) Results: Of 147,358 inpatient encounters, 20,747 (13.9%) patients were readmitted within 30 days of discharge. The final model selected was XGBoost, which had an area under the receiver operating characteristic curve of 0.783 and an area under the precision-recall curve of 0.434. The most important features by Shapley Additive Explanations were days since last admission, discharge department, and inpatient length of stay. (4) Conclusions: We developed and internally validated a supervised ML model to predict 30-day readmissions in a US-based healthcare system. This model has several advantages including state-of-the-art performance metrics, the use of clinical data, the use of features available within 24 h of discharge, and generalizability to multiple disease states.
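
The two headline metrics in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy labels, and the toy scores are hypothetical, and the precision-recall area is computed as step-wise average precision under the assumption that all scores are distinct. Because a classifier that ranks patients at random has an expected area under the precision-recall curve close to the positive-class prevalence, the reported 0.434 is best read against the reported 13.9% readmission rate (roughly three times that baseline).

```python
from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Averages the precision observed at the rank of each positive example,
    with examples sorted by descending score. Assumes distinct scores.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Hypothetical toy data: 1 = readmitted within 30 days, 0 = not readmitted.
y_demo = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
s_demo = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

prevalence = sum(y_demo) / len(y_demo)  # chance-level baseline for AUPRC
print(f"AUROC = {auroc(y_demo, s_demo):.3f}")
print(f"AUPRC = {average_precision(y_demo, s_demo):.3f} (baseline {prevalence:.3f})")
```

The pairwise AUROC loop is quadratic in the number of encounters, which is fine for illustration; at the scale of the study's dataset, a rank-based formulation or a library routine would be the practical choice.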