Amanda L. Luo, Akshay Ravi, Simone Arvisais-Anhalt, Anoop Muniyappa, Xinran Liu, Sha Wang
{"title":"Development and Internal Validation of an Interpretable Machine Learning Model to Predict Readmissions in a United States Healthcare System","authors":"Amanda L. Luo, Akshay Ravi, Simone Arvisais-Anhalt, Anoop Muniyappa, Xinran Liu, Sha Wang","doi":"10.3390/informatics10020033","DOIUrl":null,"url":null,"abstract":"(1) One in four hospital readmissions is potentially preventable. Machine learning (ML) models have been developed to predict hospital readmissions and risk-stratify patients, but thus far they have been limited in clinical applicability, timeliness, and generalizability. (2) Methods: Using deidentified clinical data from the University of California, San Francisco (UCSF) between January 2016 and November 2021, we developed and compared four supervised ML models (logistic regression, random forest, gradient boosting, and XGBoost) to predict 30-day readmissions for adults admitted to a UCSF hospital. (3) Results: Of 147,358 inpatient encounters, 20,747 (13.9%) patients were readmitted within 30 days of discharge. The final model selected was XGBoost, which had an area under the receiver operating characteristic curve of 0.783 and an area under the precision-recall curve of 0.434. The most important features by Shapley Additive Explanations were days since last admission, discharge department, and inpatient length of stay. (4) Conclusions: We developed and internally validated a supervised ML model to predict 30-day readmissions in a US-based healthcare system. This model has several advantages including state-of-the-art performance metrics, the use of clinical data, the use of features available within 24 h of discharge, and generalizability to multiple disease states.","PeriodicalId":37100,"journal":{"name":"Informatics","volume":"10 1","pages":"33"},"PeriodicalIF":3.4000,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/informatics10020033","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0
Abstract
(1) One in four hospital readmissions is potentially preventable. Machine learning (ML) models have been developed to predict hospital readmissions and risk-stratify patients, but thus far they have been limited in clinical applicability, timeliness, and generalizability. (2) Methods: Using deidentified clinical data from the University of California, San Francisco (UCSF) between January 2016 and November 2021, we developed and compared four supervised ML models (logistic regression, random forest, gradient boosting, and XGBoost) to predict 30-day readmissions for adults admitted to a UCSF hospital. (3) Results: Of 147,358 inpatient encounters, 20,747 (13.9%) patients were readmitted within 30 days of discharge. The final model selected was XGBoost, which had an area under the receiver operating characteristic curve of 0.783 and an area under the precision-recall curve of 0.434. The most important features by Shapley Additive Explanations were days since last admission, discharge department, and inpatient length of stay. (4) Conclusions: We developed and internally validated a supervised ML model to predict 30-day readmissions in a US-based healthcare system. This model has several advantages including state-of-the-art performance metrics, the use of clinical data, the use of features available within 24 h of discharge, and generalizability to multiple disease states.