Explainable SHAP-XGBoost models for in-hospital mortality after myocardial infarction

Cardiovascular Digital Health Journal (IF 2.7, Q2, Cardiac & Cardiovascular Systems). Pub Date: 2023-08-01. Epub Date: 2023-06-14. DOI: 10.1016/j.cvdhj.2023.06.001

Constantine Tarabanis MD, Evangelos Kalampokis PhD, Mahmoud Khalil MD, Carlos L. Alviar MD, Larry A. Chinitz MD, FHRS, Lior Jankelson MD, PhD

Volume 4, Issue 4, Pages 126-132. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10435947/pdf/ ; publisher page: https://www.sciencedirect.com/science/article/pii/S2666693623000361


Abstract

Background

A lack of explainability in published machine learning (ML) models limits clinicians’ understanding of how predictions are made, in turn undermining uptake of the models into clinical practice.

Objective

The purpose of this study was to develop explainable ML models to predict in-hospital mortality in patients hospitalized for myocardial infarction (MI).

Methods

Adult patients hospitalized for an MI were identified in the National Inpatient Sample between January 1, 2012, and September 30, 2015. The resulting cohort comprised 457,096 patients described by 64 predictor variables relating to demographic/comorbidity characteristics and in-hospital complications. The gradient boosting algorithm eXtreme Gradient Boosting (XGBoost) was used to develop explainable models for in-hospital mortality prediction in the overall cohort and patient subgroups based on MI type and/or sex.
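As a rough illustration of the gradient boosting idea behind XGBoost, the sketch below fits depth-1 trees (stumps) to a binary outcome using second-order (Newton) leaf weights of the form w = -G / (H + lambda). It is a pure-Python teaching sketch, not the XGBoost library and not the authors' pipeline. The ten-row dataset, the outcome rule, and every parameter value (`n_rounds`, `lr`, `lam`) are invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def best_stump(X, grad, hess, lam):
    """Return the split (feature, threshold, left_w, right_w) with the largest gain."""
    G, H = sum(grad), sum(hess)
    best, best_gain = None, 0.0
    for j in range(len(X[0])):
        # Skip the largest value: it would send every row to the left leaf.
        for t in sorted({row[j] for row in X})[:-1]:
            GL = sum(g for row, g in zip(X, grad) if row[j] <= t)
            HL = sum(h for row, h in zip(X, hess) if row[j] <= t)
            GR, HR = G - GL, H - HL
            # Regularized split gain, up to a constant factor (no complexity penalty).
            gain = GL**2 / (HL + lam) + GR**2 / (HR + lam) - G**2 / (H + lam)
            if gain > best_gain:
                # Newton-step leaf weights: w = -G_leaf / (H_leaf + lambda).
                best, best_gain = (j, t, -GL / (HL + lam), -GR / (HR + lam)), gain
    return best

def fit(X, y, n_rounds=30, lr=0.3, lam=1.0):
    """Boost stumps on the log-loss; returns stumps with the learning rate applied."""
    scores = [0.0] * len(y)  # log-odds, starting at 0 (probability 0.5)
    stumps = []
    for _ in range(n_rounds):
        p = [sigmoid(s) for s in scores]
        grad = [pi - yi for pi, yi in zip(p, y)]   # first derivative of log-loss
        hess = [pi * (1.0 - pi) for pi in p]       # second derivative of log-loss
        stump = best_stump(X, grad, hess, lam)
        if stump is None:
            break
        j, t, wl, wr = stump
        stumps.append((j, t, lr * wl, lr * wr))
        scores = [s + lr * (wl if row[j] <= t else wr) for s, row in zip(scores, X)]
    return stumps

def predict_proba(stumps, row):
    return sigmoid(sum(wl if row[j] <= t else wr for j, t, wl, wr in stumps))

def log_loss(y, probs):
    return -sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
                for yi, pi in zip(y, probs)) / len(y)

# Invented toy data: (age, underwent_pci) -> died in hospital. Not study data.
X = [(age, pci) for age in (45, 55, 65, 75, 85) for pci in (0, 1)]
y = [1 if age >= 85 or (age >= 75 and pci == 0) else 0 for age, pci in X]

stumps = fit(X, y)
probs = [predict_proba(stumps, row) for row in X]
print("training log-loss:", round(log_loss(y, probs), 3))
```

A real analysis at this scale would use the XGBoost library with held-out evaluation; the sketch only shows how each boosting round adds a small tree fitted to the gradient and curvature of the loss.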

Results

The resulting models exhibited an area under the receiver operating characteristic curve (AUC) ranging from 0.876 to 0.942, specificity of 82% to 87%, and sensitivity of 75% to 87%. All models exhibited a high negative predictive value (≥0.974). The SHapley Additive exPlanation (SHAP) framework was applied to explain the models. The top predictors of increased and decreased mortality were age and undergoing percutaneous coronary intervention, respectively. Other notable findings included a decreased mortality risk associated with certain patient subpopulations with hyperlipidemia and a comparatively greater risk of death among women below age 55 years.
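SHAP attributions are built on Shapley values: each feature's contribution is its average marginal effect over all orderings in which features are switched from a reference value to the patient's value, and the contributions sum to the difference between the patient's prediction and the reference prediction. The sketch below computes exact Shapley values by brute force for a three-feature toy model. The function `toy_log_odds`, its coefficients, the patient, and the baseline are all hypothetical and are not taken from the study; the SHAP library uses faster algorithms for tree ensembles than this enumeration.

```python
from itertools import permutations

def toy_log_odds(f):
    """Hypothetical risk score in log-odds; all coefficients are invented."""
    score = -4.0 + 0.05 * (f["age"] - 50) - 1.0 * f["pci"] + 0.2 * f["female"]
    if f["age"] >= 75 and not f["pci"]:
        score += 0.5  # invented interaction term
    return score

def shapley_values(model, x, baseline):
    """Exact Shapley values with a single reference point (brute force)."""
    names = list(x)
    phi = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]          # switch this feature to the patient's value
            updated = model(current)
            phi[name] += updated - previous  # marginal contribution in this ordering
            previous = updated
    return {name: total / len(orders) for name, total in phi.items()}

patient = {"age": 80, "pci": 0, "female": 1}    # hypothetical patient
baseline = {"age": 65, "pci": 1, "female": 0}   # hypothetical reference patient

phi = shapley_values(toy_log_odds, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
print("baseline + sum of contributions:", toy_log_odds(baseline) + sum(phi.values()))
print("model output for patient:       ", toy_log_odds(patient))
```

The last two printed values agree, which is the additivity property that lets each prediction be decomposed into per-feature contributions.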

Conclusion

The literature lacks explainable ML models predicting in-hospital mortality after an MI. In a national registry, explainable ML models performed best in ruling out in-hospital death post-MI, and their explanation illustrated their potential for guiding hypothesis generation and future study design.
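The "ruling out" framing corresponds to negative predictive value (NPV), which depends on outcome prevalence as well as on sensitivity and specificity. The sketch below applies Bayes' rule to show how a model with moderate sensitivity can still have a high NPV when the outcome is uncommon. The sensitivity and specificity are the lower bounds reported in the abstract; the prevalence values are assumptions for illustration only, since the abstract does not report the in-hospital mortality rate.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Sensitivity and specificity: lower bounds from the abstract.
# Prevalence values: hypothetical, not reported in the abstract.
for prevalence in (0.02, 0.05, 0.10):
    ppv, npv = predictive_values(0.75, 0.82, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumed prevalences the NPV stays high while the positive predictive value remains modest, which is consistent with a model that is more useful for excluding in-hospital death than for confirming it.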


