Application of the Unbalanced Ensemble Algorithm for Prognostic Prediction Outcomes of All-Cause Mortality in Coronary Heart Disease Patients Comorbid with Hypertension

IF 2.7; Zone 4 (Medicine); JCR Q2, HEALTH CARE SCIENCES & SERVICES; Risk Management and Healthcare Policy; Pub Date: 2024-08-06; DOI: 10.2147/rmhp.s472398
Jiaxin Zan, Xiaojing Dong, Hong Yang, Jingjing Yan, Zixuan He, Jing Tian, Yanbo Zhang
Citations: 0

Abstract

Purpose: This study sought to develop an unbalanced-ensemble model that could accurately predict death outcomes of patients with comorbid coronary heart disease (CHD) and hypertension and evaluate the factors contributing to death.
Patients and Methods: Medical records were collected for 1058 patients with coronary heart disease combined with hypertension; patients with acute coronary syndrome were excluded. Patients were followed up at the first, third, sixth, and twelfth months after discharge to record death events, and follow-up ended two years after discharge. Patients were divided into survival and nonsurvival groups. From the medical records, gender, smoking, drinking, COPD, cerebral stroke, diabetes, hyperhomocysteinemia, heart failure, renal insufficiency, and other influencing factors were compiled and compared between the two groups, and feature selection was carried out before the models were constructed. Because the data were imbalanced, we developed four unbalanced-ensemble prediction models (Balanced Random Forest (BRF), EasyEnsemble, RUSBoost, and SMOTEBoost) and two base classifiers (AdaBoost and logistic regression). The hyperparameters of each model were tuned with GridSearchCV, and the models were evaluated using area under the curve (AUC), sensitivity, recall, Brier score, and geometric mean (G-mean). Additionally, to understand the influence of the variables on model performance, we applied SHapley Additive exPlanations (SHAP) to the optimal model.
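The undersampling-based ensembles named above (BRF, EasyEnsemble, RUSBoost) all train their base learners on class-balanced subsets of an imbalanced cohort. The following is a minimal sketch of that one step only, not the authors' pipeline: the function names, the toy label counts, and the choice to sample the majority class without replacement are assumptions made for illustration (BRF, for instance, is usually described with bootstrap sampling).

```python
import random


def balanced_undersample(labels, rng):
    """Return sorted row indices with equal numbers of each class.

    Keeps every minority-class row and draws an equally sized random
    subset of the majority class without replacement.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    sampled = rng.sample(majority, len(minority))
    return sorted(minority + sampled)


def balanced_subsets(labels, n_subsets, seed=0):
    """Draw one balanced subset per base learner of a hypothetical ensemble."""
    rng = random.Random(seed)
    return [balanced_undersample(labels, rng) for _ in range(n_subsets)]


# Toy cohort, illustrative only: 5 deaths (1) among 50 patients.
toy_labels = [1] * 5 + [0] * 45
subsets = balanced_subsets(toy_labels, n_subsets=3)
for idx in subsets:
    n_deaths = sum(toy_labels[i] for i in idx)
    print(f"subset size={len(idx)}, deaths={n_deaths}")
```

Each subset would then be used to fit one base learner, and the ensemble would average or vote over the learners, so no single learner is dominated by the majority (survival) class.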
Results: Age, heart rate, COPD, cerebral stroke, heart failure, and renal insufficiency differed significantly between the nonsurvival and survival groups. Among all models, BRF yielded the highest AUC (0.810; 95% CI, 0.778–0.839), sensitivity (0.990; 95% CI, 0.981–1.000), recall (0.990; 95% CI, 0.981–1.000), and G-mean (0.806; 95% CI, 0.778–0.827), and the lowest Brier score (0.181; 95% CI, 0.178–0.185). Therefore, we identified BRF as the optimal model. Furthermore, red blood cell count (RBC), body mass index (BMI), and lactate dehydrogenase were found to be important mortality-associated risk factors.
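For readers less familiar with the metrics reported above, here is a small sketch of how they are commonly computed for a binary outcome. It uses the standard textbook definitions, under which sensitivity and recall are the same quantity (which would explain their identical values), and it assumes G-mean is the square root of sensitivity times specificity. The toy labels, probabilities, and the 0.5 threshold are made up for illustration and are not the study's data.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(y_true, y_pred):
    """Sensitivity (identical to recall): TP / (TP + FN)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    """Specificity: TN / (TN + FP)."""
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(sensitivity(y_true, y_pred) * specificity(y_true, y_pred))


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


def auc(y_true, probs):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [p for t, p in zip(y_true, probs) if t == 1]
    neg = [p for t, p in zip(y_true, probs) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy data, illustrative only.
y_true = [1, 1, 0, 0, 0, 0]
probs = [0.9, 0.4, 0.6, 0.2, 0.1, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in probs]

print("sensitivity:", sensitivity(y_true, y_pred))
print("specificity:", specificity(y_true, y_pred))
print("G-mean:", round(g_mean(y_true, y_pred), 4))
print("Brier:", round(brier_score(y_true, probs), 4))
print("AUC:", auc(y_true, probs))
```

If the study used this definition of G-mean, the reported sensitivity of 0.990 and G-mean of 0.806 would imply a specificity of roughly 0.806² / 0.990 ≈ 0.66; the abstract itself does not report specificity, so this figure is only an inference under that assumption.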
Conclusion: BRF combined with advanced machine learning methods and SHAP is highly effective and accurately predicts mortality in patients with CHD comorbid with hypertension. This model has the potential to assist clinicians in modifying treatment strategies to improve patient outcomes.
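The SHAP analysis mentioned above rests on Shapley values, which split a single prediction into per-feature contributions. As a rough illustration of the idea, the sketch below computes exact Shapley values by brute force for a three-feature toy function. The function `toy_model`, its inputs, and the all-zero baseline are hypothetical and unrelated to the study's model; the SHAP library itself uses far more efficient estimators and a background dataset rather than a single baseline.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values by averaging over all feature orderings.

    For each ordering, features are switched one at a time from their
    baseline value to their actual value, and the resulting change in
    the prediction is credited to the feature that was switched.
    """
    n = len(x)
    contrib = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = predict(current)
        for j in order:
            current[j] = x[j]
            new = predict(current)
            contrib[j] += new - prev
            prev = new
    return [c / len(orderings) for c in contrib]


def toy_model(features):
    """Hypothetical model: one additive term plus one interaction term."""
    a, b, c = features
    return 2 * a + b * c


x = [1, 2, 3]
baseline = [0, 0, 0]
phi = shapley_values(toy_model, x, baseline)
print(phi)  # [2.0, 3.0, 3.0]
print(sum(phi) == toy_model(x) - toy_model(baseline))  # True
```

The contributions sum to the difference between the prediction and the baseline prediction, and the interaction term `b * c` is shared equally between the two features involved. Ranking features by the size of such contributions across many patients is, in broad terms, how a SHAP summary identifies the most influential variables.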

Keywords: coronary heart disease comorbid with hypertension, ensemble learning, balanced random forest, SHAP, prognosis
Source journal: Risk Management and Healthcare Policy
Subject area: Medicine (Public Health, Environmental and Occupational Health)
CiteScore: 6.20
Self-citation rate: 2.90%
Articles published: 242
Review time: 16 weeks
Journal description: Risk Management and Healthcare Policy is an international, peer-reviewed, open access journal focusing on all aspects of public health, policy and preventative measures to promote good health and improve morbidity and mortality in the population. Specific topics covered in the journal include: public and community health; policy and law; preventative and predictive healthcare; risk and hazard management; epidemiology, detection and screening; lifestyle and diet modification; vaccination and disease transmission/modification programs; health and safety and occupational health; healthcare services provision; health literacy and education; advertising and promotion of health issues; health economic evaluations and resource management. Risk Management and Healthcare Policy focuses on human interventional and observational research. The journal welcomes submitted papers covering original research, clinical and epidemiological studies, reviews and evaluations, guidelines, expert opinion and commentary, and extended reports. Case reports will only be considered if they make a valuable and original contribution to the literature. The journal does not accept study protocols, animal-based or cell line-based studies.