A literature-based approach to predict continuous hospital length of stay in adult acute care patients using admission variables: A single university center experience
Mieke Deschepper , Chloë De Smedt , Kirsten Colpaert
International Journal of Medical Informatics, volume 193, Article 105678. Published 2024-10-28. DOI: 10.1016/j.ijmedinf.2024.105678
https://www.sciencedirect.com/science/article/pii/S1386505624003411
Abstract
Purpose
To review the existing literature on predicting length of stay (LOS) and to apply the findings to a real-world data example from a single hospital.
Methods
We perform a literature review on PubMed and Embase, focusing on adults, acute conditions, and hospital-wide prediction of LOS, and summarize all the variables and statistical methods used to predict LOS. We then apply this set of variables to data from a single university hospital and run an XGBoost model with survival Cox regression on LOS, as well as a logistic regression on binary LOS (cut-off at 4 days). Model metrics are the concordance index (c-index) and the area under the curve (AUC).
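To make the two metrics concrete, the following is a minimal sketch, not the authors' implementation. It assumes that AUC means the area under the ROC curve, that a "long" stay is one strictly longer than the 4-day cut-off, and that every stay ends in an observed discharge unless an `events` list says otherwise. The function names and the toy numbers are hypothetical.

```python
from itertools import combinations


def binarize_los(los_days, cutoff=4.0):
    """Label a stay 1 if it is longer than the cut-off, else 0.

    Assumption: "longer than" (strict) is used; the source only states
    a cut-off at 4 days.
    """
    return [1 if d > cutoff else 0 for d in los_days]


def auc(labels, scores):
    """ROC AUC as the probability that a random positive case scores
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def c_index(times, risk, events=None):
    """Harrell's concordance index.

    `risk` is a Cox-style score: a higher value means the event
    (here, discharge) is expected sooner, i.e. a shorter stay.
    Pairs with equal times are skipped for simplicity.
    """
    if events is None:
        events = [1] * len(times)  # assumption: no censoring
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # a is the patient with the shorter stay
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter stay was censored: pair not comparable
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0
        elif risk[a] == risk[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): LOS in days and model outputs.
los = [1, 2, 3, 5, 8, 13]
discharge_risk = [0.9, 0.8, 0.5, 0.6, 0.2, 0.1]   # Cox-style score
long_stay_score = [0.1, 0.2, 0.5, 0.4, 0.8, 0.9]  # classifier score

print(c_index(los, discharge_risk))
print(auc(binarize_los(los), long_stay_score))
```

Both metrics are rank-based: they reward ordering patients correctly and say nothing about how close a predicted LOS is to the observed number of days.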
Results
After applying the search strategy and exclusion criteria, 57 articles are included in the study. The list of variables is long, but the existing literature mostly uses non-clinical data. A wide range of statistical methods is used, with a recent trend toward machine learning models. The XGBoost model with Cox regression yields a C-index of 0.87, and the logistic regression on binary LOS has an AUC of 0.94.
Conclusions
Many variables identified in the literature are not available at the time of admission, yet they are still used in models for predicting LOS. Machine learning has become the preferred statistical approach in recent studies, though mainly for binary LOS predictions. Based on the current literature, it remains challenging to derive a practical and high-performing model for continuous LOS prediction.