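Interpretability-based machine learning for predicting the risk of death from pulmonary inflammation in Chinese intensive care unit patients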

Frontiers in Medicine. Pub Date: 2024-06-12. DOI: 10.3389/fmed.2024.1399527

Yihai Zhai, Danxiu Lan, Siying Lv, Liqin Mo


Citations: 0


Interpretability-based machine learning for predicting the risk of death from pulmonary inflammation in Chinese intensive care unit patients

The objective of this research was to create a machine learning predictive model that could be easily interpreted in order to precisely determine the risk of premature death in patients receiving intensive care after pulmonary inflammation.

In this study, information from the China intensive care units (ICU) Open Source database was used to examine data from 2790 patients who had infections between January 2019 and December 2020. A 7:3 ratio was used to randomly assign the whole patient population to training and validation groups. This study used six machine learning techniques: logistic regression, random forest, gradient boosting tree, extreme gradient boosting tree (XGBoost), multilayer perceptron, and K-nearest neighbor. A cross-validation grid search method was used to search the parameters in each model. Eight metrics were used to assess the models’ performance: accuracy, precision, recall, F1 score, area under the curve (AUC) value, Brier score, Jordon’s index, and calibration slope. The machine learning methods were ranked based on how well they performed in each of these metrics. The best-performing models were selected for interpretation using both the Shapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) interpretability techniques.

A subset of the study cohort’s patients (120/1668, or 7.19%) died in the hospital following screening for inclusion and exclusion criteria. Using a cross-validated grid search to evaluate the six machine learning techniques, XGBoost showed good discriminative ability, achieving an accuracy score of 0.889 (0.874–0.904), precision score of 0.871 (0.849–0.893), recall score of 0.913 (0.890–0.936), F1 score of 0.891 (0.876–0.906), and AUC of 0.956 (0.939–0.973). Additionally, XGBoost exhibited excellent performance with a Brier score of 0.050, Jordon index of 0.947, and calibration slope of 1.074. It was also possible to create an interactive internet page using the XGBoost model.

By identifying patients at higher risk of early mortality, machine learning-based mortality risk prediction models have the potential to significantly improve patient care by directing clinical decision making and enabling early detection of survival and mortality issues in patients with pulmonary inflammation disease.
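The eight evaluation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. Several things here are assumptions rather than details from the paper: the function names are hypothetical, the ten-patient data set is invented for illustration and is not study data, the classification threshold of 0.5 is arbitrary, "Jordon's index" is assumed to mean Youden's J statistic (sensitivity + specificity - 1), and the calibration slope is assumed to be the slope of a logistic regression of the outcome on the logit of the predicted probability. The authors' actual implementation is not described in the abstract.

```python
import math

def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix metrics after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Assumption: "Jordon's index" in the abstract is Youden's J.
        "youden_j": recall + specificity - 1.0,
    }

def auc(y_true, y_prob):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def calibration_slope(y_true, y_prob, iterations=25):
    """Slope b in y ~ sigmoid(a + b * logit(p)), fitted by Newton's method.

    Probabilities must lie strictly between 0 and 1, and the outcomes must
    not be perfectly separated by the predictions, or the fit diverges.
    """
    x = [math.log(p / (1.0 - p)) for p in y_prob]
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu
            g1 += (yi - mu) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b

# Invented toy data (NOT from the study): 1 = died in hospital, 0 = survived.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_prob = [0.2] * 5 + [0.8] * 5

report = threshold_metrics(y_true, y_prob)
report["auc"] = auc(y_true, y_prob)
report["brier"] = brier_score(y_true, y_prob)
report["calibration_slope"] = calibration_slope(y_true, y_prob)

for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

In this toy example one in five low-risk patients and four in five high-risk patients die, matching the predicted probabilities of 0.2 and 0.8, so the fitted calibration slope is 1. A slope above 1, such as the 1.074 reported for XGBoost, would indicate predictions that are slightly too moderate under the assumed definition.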

