Application of the Unbalanced Ensemble Algorithm for Prognostic Prediction Outcomes of All-Cause Mortality in Coronary Heart Disease Patients Comorbid with Hypertension
Authors: Jiaxin Zan, Xiaojing Dong, Hong Yang, Jingjing Yan, Zixuan He, Jing Tian, Yanbo Zhang
Journal: Risk Management and Healthcare Policy (impact factor 2.7; JCR Q2, Health Care Sciences & Services)
Published: 2024-08-06
DOI: https://doi.org/10.2147/rmhp.s472398
Citations: 0
Abstract
Purpose: This study sought to develop an unbalanced-ensemble model that accurately predicts death in patients with comorbid coronary heart disease (CHD) and hypertension, and to evaluate the factors contributing to death.
Patients and Methods: Medical records were collected for 1058 patients with CHD combined with hypertension, excluding those with acute coronary syndrome. Patients were followed up at the first, third, sixth, and twelfth months after discharge to record death events; follow-up ended two years after discharge. Patients were divided into survival and nonsurvival groups. Gender, smoking, drinking, COPD, cerebral stroke, diabetes, hyperhomocysteinemia, heart failure, renal insufficiency, and other influencing factors were extracted from the medical records and compared between the two groups, and feature selection was carried out before the models were constructed. Because the data were imbalanced, we developed four unbalanced-ensemble prediction models (Balanced Random Forest (BRF), EasyEnsemble, RUSBoost, and SMOTEBoost) alongside two base classification algorithms (AdaBoost and logistic regression). The hyperparameters of each model were tuned with GridSearchCV, and the models were evaluated using area under the curve (AUC), sensitivity, recall, Brier score, and geometric mean (G-mean). Additionally, to understand the influence of variables on model performance, we constructed a SHapley Additive exPlanations (SHAP) model based on the optimal model.
Results: Age, heart rate, COPD, cerebral stroke, heart failure, and renal insufficiency differed significantly between the nonsurvival and survival groups. Among all models, BRF yielded the highest AUC (0.810; 95% CI, 0.778–0.839), sensitivity (0.990; 95% CI, 0.981–1.000), recall (0.990; 95% CI, 0.981–1.000), and G-mean (0.806; 95% CI, 0.778–0.827), and the lowest Brier score (0.181; 95% CI, 0.178–0.185). We therefore identified BRF as the optimal model. Furthermore, red blood cell count (RBC), body mass index (BMI), and lactate dehydrogenase were found to be important mortality-associated risk factors.
Conclusion: BRF, combined with advanced machine learning methods and SHAP, is highly effective and accurately predicts mortality in patients with CHD comorbid with hypertension. This model has the potential to assist clinicians in modifying treatment strategies to improve patient outcomes.
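The abstract does not describe how the class imbalance was handled inside BRF. As a rough illustration only, the sketch below shows the balanced-bootstrap idea that Balanced Random Forest is generally built on: each tree is trained on a bootstrap sample containing equal numbers of minority and majority cases. The function name `balanced_bootstrap`, the label coding (1 = nonsurvival, 0 = survival), and the toy cohort are assumptions made for this example. They are not taken from the study.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Return row indices for one class-balanced bootstrap sample.

    Assumes binary labels where 1 (nonsurvival) is the minority class
    and 0 (survival) is the majority class.
    """
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    n = len(minority)
    # Draw n minority rows with replacement, then the same number of
    # majority rows, so the sample has a 50/50 class mix.
    sample = rng.choices(minority, k=n) + rng.choices(majority, k=n)
    rng.shuffle(sample)
    return sample


# Hypothetical toy cohort (not the study data): 200 patients, 20 deaths.
labels = [1] * 20 + [0] * 180
sample = balanced_bootstrap(labels, random.Random(0))
counts = Counter(labels[i] for i in sample)
print(counts[0], counts[1])
```

In a full forest this sampling step would be repeated once per tree. Library implementations such as imbalanced-learn's `BalancedRandomForestClassifier` handle it internally.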
Keywords: coronary heart disease comorbid with hypertension, ensemble learning, balanced random forest, SHAP, prognosis
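For readers unfamiliar with the evaluation metrics named in the abstract, the following minimal sketch computes G-mean and the Brier score from their standard definitions. G-mean is the square root of sensitivity times specificity, and the Brier score is the mean squared difference between predicted probability and observed outcome. Sensitivity and recall are the same quantity by definition, which is why the abstract reports identical values for them. The toy labels, probabilities, and the 0.5 decision threshold are assumptions for illustration and are not the study's data.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (recall) and specificity."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and outcomes."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy predictions (illustrative only).
y_true = [1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.4, 0.2, 0.6, 0.1, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

print(round(g_mean(y_true, y_pred), 3))
print(round(brier_score(y_true, y_prob), 3))
```

A lower Brier score indicates better-calibrated probabilities, while a higher G-mean indicates balanced performance on both classes, which is why G-mean is often preferred over plain accuracy for imbalanced outcomes.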
About the journal:
Risk Management and Healthcare Policy is an international, peer-reviewed, open access journal focusing on all aspects of public health, policy and preventative measures to promote good health and improve morbidity and mortality in the population. Specific topics covered in the journal include:
Public and community health
Policy and law
Preventative and predictive healthcare
Risk and hazard management
Epidemiology, detection and screening
Lifestyle and diet modification
Vaccination and disease transmission/modification programs
Health and safety and occupational health
Healthcare services provision
Health literacy and education
Advertising and promotion of health issues
Health economic evaluations and resource management
Risk Management and Healthcare Policy focuses on human interventional and observational research. The journal welcomes submitted papers covering original research, clinical and epidemiological studies, reviews and evaluations, guidelines, expert opinion and commentary, and extended reports. Case reports will only be considered if they make a valuable and original contribution to the literature. The journal does not accept study protocols, animal-based or cell line-based studies.