{"title":"Prediction of 30-day mortality for ICU patients with Sepsis-3.","authors":"Zhijiang Yu, Negin Ashrafi, Hexin Li, Kamiar Alaei, Maryam Pishgar","doi":"10.1186/s12911-024-02629-6","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>There is a growing demand for advanced methods to improve the understanding and prediction of illnesses. This study focuses on Sepsis, a critical response to infection, aiming to enhance early detection and mortality prediction for Sepsis-3 patients to improve hospital resource allocation.</p><p><strong>Methods: </strong>In this study, we developed a Machine Learning (ML) framework to predict the 30-day mortality rate of ICU patients with Sepsis-3 using the MIMIC-III database. Advanced big data extraction tools like Snowflake were used to identify eligible patients. Decision tree models and Entropy Analyses helped refine feature selection, resulting in 30 relevant features curated with clinical experts. We employed the Light Gradient Boosting Machine (LightGBM) model for its efficiency and predictive power.</p><p><strong>Results: </strong>The study comprised a cohort of 9118 Sepsis-3 patients. Our preprocessing techniques significantly improved both the AUC and accuracy metrics. The LightGBM model achieved an impressive AUC of 0.983 (95% CI: [0.980-0.990]), an accuracy of 0.966, and an F1-score of 0.910. Notably, LightGBM showed a substantial 6% improvement over our best baseline model and a 14% enhancement over the best existing literature. These advancements are attributed to (I) the inclusion of the novel and pivotal feature Hospital Length of Stay (HOSP_LOS), absent in previous studies, and (II) LightGBM's gradient boosting architecture, enabling robust predictions with high-dimensional data while maintaining computational efficiency, as demonstrated by its learning curve.</p><p><strong>Conclusions: </strong>Our preprocessing methodology reduced the number of relevant features and identified a crucial feature overlooked in previous studies. The proposed model demonstrated high predictive power and generalization capability, highlighting the potential of ML in ICU settings. This model can streamline ICU resource allocation and provide tailored interventions for Sepsis-3 patients.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3000,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11308624/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Medical Informatics and Decision Making","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1186/s12911-024-02629-6","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MEDICAL INFORMATICS","Score":null,"Total":0}
引用次数: 0
Abstract
Background: There is a growing demand for advanced methods to improve the understanding and prediction of illnesses. This study focuses on Sepsis, a critical response to infection, aiming to enhance early detection and mortality prediction for Sepsis-3 patients to improve hospital resource allocation.
Methods: In this study, we developed a Machine Learning (ML) framework to predict the 30-day mortality rate of ICU patients with Sepsis-3 using the MIMIC-III database. Advanced big data extraction tools like Snowflake were used to identify eligible patients. Decision tree models and Entropy Analyses helped refine feature selection, resulting in 30 relevant features curated with clinical experts. We employed the Light Gradient Boosting Machine (LightGBM) model for its efficiency and predictive power.
Results: The study comprised a cohort of 9118 Sepsis-3 patients. Our preprocessing techniques significantly improved both the AUC and accuracy metrics. The LightGBM model achieved an impressive AUC of 0.983 (95% CI: [0.980-0.990]), an accuracy of 0.966, and an F1-score of 0.910. Notably, LightGBM showed a substantial 6% improvement over our best baseline model and a 14% enhancement over the best existing literature. These advancements are attributed to (I) the inclusion of the novel and pivotal feature Hospital Length of Stay (HOSP_LOS), absent in previous studies, and (II) LightGBM's gradient boosting architecture, enabling robust predictions with high-dimensional data while maintaining computational efficiency, as demonstrated by its learning curve.
Conclusions: Our preprocessing methodology reduced the number of relevant features and identified a crucial feature overlooked in previous studies. The proposed model demonstrated high predictive power and generalization capability, highlighting the potential of ML in ICU settings. This model can streamline ICU resource allocation and provide tailored interventions for Sepsis-3 patients.
期刊介绍:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.