Predicting functional outcomes of patients with spontaneous intracerebral hemorrhage based on explainable machine learning models: a multicenter retrospective study.
Bin Pan, Fengda Li, Chuanghong Liu, Zeyi Li, Chengfa Sun, Kaijian Xia, Hong Xu, Gang Kong, Longyuan Gu, Kaiyuan Cheng
Frontiers in Neurology, vol. 15, article 1494934 (JCR Q2, Clinical Neurology). Published 2025-01-10. DOI: 10.3389/fneur.2024.1494934. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11757109/pdf/
Abstract
Background: Spontaneous intracerebral hemorrhage (SICH) is the second most common type of cerebrovascular disease after ischemic stroke. Its high mortality and disability rates impose a significant economic burden on families and society. This retrospective study aimed to develop and evaluate an interpretable machine learning model to predict functional outcomes 3 months after SICH.
Methods: Clinical data were analyzed retrospectively from 380 patients with SICH who were hospitalized at three centers between June 2020 and June 2023. Seventy percent of the samples were randomly assigned to the training set and the remaining 30% to the validation set. Univariate analysis, Least Absolute Shrinkage and Selection Operator (LASSO) regression, and Pearson correlation analysis were used to screen clinical variables. The selected variables were then used to fit five machine learning models: complementary naive Bayes (CNB), support vector machine (SVM), Gaussian naive Bayes (GNB), multilayer perceptron (MLP), and extreme gradient boosting (XGB). Model performance was compared using the area under the curve (AUC), and global and individual interpretive analyses were conducted using importance ranking and Shapley additive explanations (SHAP).
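The split and the comparison metric described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names `split_70_30` and `roc_auc`, the seed, and the toy labels and scores are assumptions made for illustration, and the AUC is assumed here to be the area under the ROC curve, computed as the fraction of positive/negative pairs that the model ranks correctly.

```python
import random


def split_70_30(n_samples, seed=0):
    """Shuffle sample indices and return (train_idx, valid_idx) in a 70/30 ratio."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # seeded so the split is reproducible
    cut = round(0.7 * n_samples)
    return idx[:cut], idx[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 380 patients, as in the study; the indices stand in for patient records.
train_idx, valid_idx = split_70_30(380)
print(len(train_idx), len(valid_idx))  # 266 114

# Toy labels (1 = poor outcome) and toy predicted risks, invented for illustration.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

The pairwise definition is quadratic in the number of samples, which is acceptable for a validation set of this size; library implementations typically use a sort-based computation instead.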
Results: Of the 380 patients, 95 (25%) ultimately had a poor outcome. In the validation set, the AUC values for the CNB, SVM, GNB, MLP, and XGB models were 0.899 (0.816-0.979), 0.916 (0.847-0.982), 0.730 (0.602-0.857), 0.913 (0.834-0.986), and 0.969 (0.937-0.998), respectively. The XGB model therefore performed best among the five algorithms. SHAP analysis identified GCS score, hematoma volume, blood pressure changes, platelets, age, bleeding location, and blood glucose levels as the most important predictors of poor prognosis.
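To make the SHAP attribution concrete, the sketch below computes exact Shapley values by brute force for a toy risk score. It is only an illustration of the underlying idea: the SHAP library uses much faster algorithms for tree models such as XGB, and the function `toy_risk`, its coefficients, the feature names, and the patient and baseline values are all hypothetical, not taken from the study.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values: the average marginal contribution of each
    feature over every ordering in which features are switched from the
    baseline value to the patient's value."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)       # start from the reference patient
        prev = predict(current)
        for name in order:
            current[name] = x[name]    # reveal this feature's real value
            new = predict(current)
            totals[name] += new - prev
            prev = new
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(f):
    """Hypothetical risk score with invented coefficients (not the paper's model)."""
    score = 0.02 * f["hematoma_ml"] - 0.05 * f["gcs"]
    if f["gcs"] < 9 and f["hematoma_ml"] > 30:
        score += 0.2                   # interaction: low GCS with a large hematoma
    return score


patient = {"gcs": 7, "hematoma_ml": 40, "age": 70}      # hypothetical patient
reference = {"gcs": 15, "hematoma_ml": 10, "age": 60}   # hypothetical baseline

phi = shapley_values(toy_risk, patient, reference)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the property that lets SHAP plots decompose an individual prediction feature by feature. A feature the model ignores (here `age`) receives a contribution of zero.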
Conclusion: The XGB model developed in this study can effectively predict the risk of poor prognosis in patients with SICH, helping clinicians make personalized and rational clinical decisions. Prognostic risk in patients with SICH is closely associated with GCS score, hematoma volume, blood pressure changes, platelets, age, bleeding location, and blood glucose levels.
About the journal:
The section Stroke aims to quickly and accurately publish important experimental, translational and clinical studies, and reviews that contribute to the knowledge of stroke, its causes, manifestations, diagnosis, and management.