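Prediction of 28-Day All-Cause Mortality in Heart Failure Patients with Clostridioides difficile Infection Using Machine Learning Models: Evidence from the MIMIC-IV Database.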

Cardiology (IF 1.9; Zone 4, Medicine; JCR Q3, Cardiac & Cardiovascular Systems) Pub Date: 2024-08-17 DOI: 10.1159/000540994
Caiping Shi, Qiong Jie, Hongsong Zhang, Xinying Zhang, Weijuan Chu, Chen Chen, Qian Zhang, Zhen Hu
Citations: 0

Abstract

Prediction of 28-Day All-Cause Mortality in Heart Failure Patients with Clostridioides difficile Infection Using Machine Learning Models: Evidence from the MIMIC-IV Database.

Introduction: Heart failure (HF) may induce bowel hypoperfusion, leading to hypoxia of the villi of the bowel wall and the occurrence of Clostridioides difficile infection (CDI). However, the risk factors for the development of CDI in HF patients have yet to be fully elucidated, largely because evidence from real-world data is lacking.

Methods: Clinical data and survival outcomes of HF patients with CDI admitted to the ICU were extracted from the Medical Information Mart for Intensive Care (MIMIC)-IV database. To develop a model that predicts 28-day all-cause mortality in HF patients with CDI, Recursive Feature Elimination with Cross-Validation (RFE-CV) was used for feature selection. Nine machine learning (ML) algorithms were applied for model construction: logistic regression (LR), decision tree, Bayesian, adaptive boosting, random forest (RF), gradient boosting decision tree, XGBoost, light gradient boosting machine, and categorical boosting. After training and hyperparameter optimization through grid search with 5-fold cross-validation, model performance was evaluated by the area under the curve (AUC), accuracy, sensitivity, specificity, precision, negative predictive value, and F1 score. Furthermore, the SHapley Additive exPlanations (SHAP) method was used to interpret the optimal model.
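
The feature-selection and tuning steps described in the Methods can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the data are synthetic (generated to mimic only the cohort's size, variable count, and event rate), and the train/test split, the random forest base estimator for RFE-CV, the parameter grid, and the AUC scoring are all assumptions, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic stand-in for the cohort: 526 rows, 57 candidate variables,
# roughly 19% positive class. These are NOT MIMIC-IV data.
X, y = make_classification(
    n_samples=526, n_features=57, n_informative=10,
    weights=[0.81, 0.19], random_state=0,
)

# Assumed hold-out split; the abstract does not describe how data were split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Stratified 5-fold splitter reused for both steps.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Step 1: recursive feature elimination with cross-validation (RFE-CV).
# The base estimator, step size, and minimum feature count are assumptions.
selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    step=5, cv=cv, scoring="roc_auc", min_features_to_select=5,
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Step 2: grid search with 5-fold cross-validation on the retained features.
# The grid below is hypothetical and far smaller than a real search.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=cv, scoring="roc_auc"
)
search.fit(X_train_sel, y_train)

print("features kept:", selector.n_features_)
print("best params:", search.best_params_)
print("hold-out AUC:", search.score(X_test_sel, y_test))
```

Fitting the selector on the training portion only keeps the held-out rows from influencing which variables are retained. The same pattern would apply to the other eight algorithms by swapping the estimator and its grid.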

Results: A total of 526 HF patients with CDI were included, of whom 99 (18.8%) died within 28 days. Eighteen of the 57 variables were selected for model construction. Among the ML models considered, the RF model emerged as optimal, achieving accuracy, F1 score, and AUC values of 0.821, 0.596, and 0.864, respectively. Based on decision curve analysis, the net benefit of the RF model surpassed that of the other models at threshold probabilities of 16%-22%. According to feature importance in the RF model, the five most influential variables were red blood cell distribution width, blood urea nitrogen, Simplified Acute Physiology Score II, Sequential Organ Failure Assessment score, and white blood cell count.
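
For readers unfamiliar with decision curve analysis, the net benefit at a threshold probability pt is conventionally computed as TP/n - (FP/n) * pt/(1 - pt), and compared against a "treat all" strategy. The sketch below implements that standard formula in plain Python on a small made-up example; the labels and probabilities are illustrative only and are not taken from the study, and the function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every patient as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Made-up outcomes (1 = died within 28 days) and predicted risks.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.90, 0.10, 0.30, 0.60, 0.05, 0.12, 0.15, 0.18, 0.40, 0.02]

for pt in (0.16, 0.20, 0.22):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"pt={pt:.2f}  model={model_nb:.3f}  treat_all={all_nb:.3f}")
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all curve and zero (the "treat none" strategy), which is the kind of comparison reported for the 16%-22% range.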

Conclusions: We developed ML models to predict 28-day all-cause mortality in HF patients with CDI in the ICU; these models were more effective than the conventional LR model. The RF model performed best among all the ML models employed and may help clinicians identify high-risk HF patients with CDI.

Source journal
Cardiology (Medicine: Cardiovascular System)
CiteScore: 3.40
Self-citation rate: 5.30%
Articles published: 56
Review time: 1.5 months
About the journal: 'Cardiology' features first reports on original clinical, preclinical and fundamental research as well as 'Novel Insights from Clinical Experience' and topical comprehensive reviews in selected areas of cardiovascular disease. 'Editorial Comments' provide a critical but positive evaluation of a recent article. Papers not only describe but offer critical appraisals of new developments in non-invasive and invasive diagnostic methods and in pharmacologic, nutritional and mechanical/surgical therapies. Readers are thus kept informed of current strategies in the prevention, recognition and treatment of heart disease. Special sections in a variety of subspecialty areas reinforce the journal's value as a complete record of recent progress for all cardiologists, internists, cardiac surgeons, clinical physiologists, pharmacologists and professionals in other areas of medicine interested in current activity in cardiovascular diseases.