In-Hospital Mortality Prediction among Intensive Care Unit Patients with Acute Ischemic Stroke: A Machine Learning Approach
Jack A Cummins, Ben S Gerber, Mayuko Ito Fukunaga, Nils Henninger, Catarina I Kiefe, Feifan Liu
Health Data Science, vol. 5, article 0179. Published 2025-03-17. DOI: https://doi.org/10.34133/hds.0179
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11912875/pdf/
Abstract
Background: Acute ischemic stroke is a leading cause of death in the United States. Identifying patients with stroke at high risk of mortality is crucial for timely intervention and optimal resource allocation. This study aims to develop and validate machine learning-based models to predict in-hospital mortality risk for intensive care unit (ICU) patients with acute ischemic stroke and to identify important associated factors.

Methods: Our data include 3,489 ICU admissions for acute ischemic stroke from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database, restricted to patients who had neither been discharged nor died within the first 48 h. Demographic, hospitalization type, procedure, medication, intake (intravenous and oral), laboratory, vital sign, and clinical assessment data [e.g., Glasgow Coma Scale (GCS) scores] from the initial 48 h of admission were used to predict in-hospital mortality after 48 h of ICU admission. We explored 3 machine learning models (random forests, logistic regression, and XGBoost) and applied Bayesian optimization for hyperparameter tuning. Important features were identified using learned coefficients.

Results: Experiments show that XGBoost tuned for area under the receiver operating characteristic curve (AUC ROC) was the best-performing model (AUC ROC 0.86, F1 0.52), compared to random forests (AUC ROC 0.85, F1 0.47) and logistic regression (AUC ROC 0.75, F1 0.40). Top features include GCS, blood urea nitrogen, and the Richmond Agitation-Sedation Scale (RASS) score. The model also demonstrates good fairness for males versus females and across racial/ethnic groups.

Conclusions: Machine learning has shown great potential in predicting in-hospital mortality risk for people with acute ischemic stroke in the ICU setting. However, further ethical consideration is needed to ensure that performance differences across racial/ethnic groups do not exacerbate existing health disparities or harm historically marginalized populations.
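The abstract reports two metrics (AUC ROC and F1) and a comparison of performance across demographic groups. The following is a minimal sketch of how such numbers can be computed, not the authors' code. The labels, scores, and group names are hypothetical toy data, the 0.5 decision threshold is an assumption, and comparing per-group AUC ROC is only one possible fairness check; the abstract does not state which fairness measure the study used.

```python
from typing import Dict, Sequence


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC ROC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """F1 for the positive class after thresholding the predicted risk."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc_by_group(y_true: Sequence[int], scores: Sequence[float],
                 groups: Sequence[str]) -> Dict[str, float]:
    """AUC ROC computed separately within each subgroup."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, v in enumerate(groups) if v == g]
        result[g] = auc_roc([y_true[i] for i in idx],
                            [scores[i] for i in idx])
    return result


# Hypothetical toy data: 1 = died in hospital, scores = predicted risk.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.5]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall_auc = auc_roc(y_true, scores)
overall_f1 = f1_score(y_true, scores)
group_auc = auc_by_group(y_true, scores, groups)

print(f"AUC ROC: {overall_auc:.2f}")
print(f"F1 at threshold 0.5: {overall_f1:.2f}")
for name, value in group_auc.items():
    print(f"AUC ROC, group {name}: {value:.2f}")
```

A large gap between the per-group values would flag a performance difference worth investigating. With an outcome as imbalanced as in-hospital mortality, F1 depends strongly on the chosen threshold, which may help explain why a model can pair a high AUC ROC with a modest F1.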