Explainable machine learning model for assessing health status in patients with comorbid coronary heart disease and depression: Development and validation study
Authors: Jiqing Li, Shuo Wu, Jianhua Gu
Journal: International Journal of Medical Informatics, volume 196, Article 105808
Publication date: 2025-01-23
DOI: 10.1016/j.ijmedinf.2025.105808
URL: https://www.sciencedirect.com/science/article/pii/S1386505625000255
Abstract
Background
Coronary heart disease (CHD) and depression frequently co-occur, significantly impacting patient outcomes. However, comprehensive health status assessment tools for this complex population are lacking. This study aimed to develop and validate an explainable machine learning model to evaluate overall health status in patients with comorbid CHD and depression.
Methods
Utilizing data from the 2021–2022 Behavioral Risk Factor Surveillance System, we developed and externally validated machine learning models to predict overall health status, defined as having both poor physical and mental health for ≥ 14 days in the past 30 days. Eleven machine learning algorithms were evaluated, including artificial neural networks, support vector machines, and ensemble methods. The SHapley Additive exPlanations (SHAP) method was employed to enhance model interpretability. Model performance was assessed using discrimination, calibration, and decision curve analysis.
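The outcome definition and the decision curve step can be illustrated with a minimal sketch. The function and parameter names below (`poor_overall_health`, `net_benefit`, `physical_days`, `mental_days`) are hypothetical and are not taken from the study's code. The sketch assumes the inputs are already cleaned day counts in the range 0 to 30, and it uses the standard net-benefit formula from decision curve analysis rather than any implementation confirmed by the authors.

```python
from typing import Optional, Sequence


def poor_overall_health(physical_days: Optional[int],
                        mental_days: Optional[int],
                        threshold: int = 14) -> Optional[bool]:
    """Composite outcome: both day counts are at or above the threshold.

    Inputs are the numbers of days (0-30) in the past 30 days on which
    physical and mental health were not good. Returns None when either
    count is missing, so the record can be excluded explicitly.
    """
    if physical_days is None or mental_days is None:
        return None
    return physical_days >= threshold and mental_days >= threshold


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit of flagging everyone whose predicted risk >= threshold.

    NB = TP/n - FP/n * (pt / (1 - pt)), where pt is the risk threshold.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    if n == 0:
        raise ValueError("y_true must not be empty")
    # Count true and false positives among the flagged records.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * (threshold / (1.0 - threshold))
```

Evaluating `net_benefit` over a grid of thresholds and plotting the result against the "flag everyone" and "flag no one" strategies gives a decision curve.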
Results
The study included 9,747 participants in the derivation cohort and 8,394 in the external validation cohort. Among the eleven algorithms evaluated, an optimized XGBoost model with eight key features demonstrated balanced performance. SHAP analysis revealed that employment status, physical activity, income, and age were the most influential predictors. The model maintained good discrimination (AUC 0.712, 95% CI 0.703–0.721 in derivation; AUC 0.711, 95% CI 0.701–0.721 in validation), calibration and clinical utility across both cohorts.
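As an illustration of how a discrimination estimate with a 95% confidence interval can be obtained, the following is a minimal sketch. The abstract does not state how the intervals were computed. The percentile bootstrap used here is one common choice and is an assumption, as are the names `auc` and `bootstrap_auc_ci`.

```python
import random
from typing import List, Sequence, Tuple


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative.

    Ties count as 0.5. The pairwise loop is quadratic in sample size,
    which is acceptable for a sketch but slow for large cohorts.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true: Sequence[int], y_score: Sequence[float],
                     n_boot: int = 1000, alpha: float = 0.05,
                     seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the AUC."""
    auc(y_true, y_score)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(y_true)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

For example, `auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` evaluates to 0.75, because three of the four positive-negative pairs are ranked correctly.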
Conclusion
Our explainable machine learning model provides a novel, comprehensive approach to assessing health status in patients with comorbid CHD and depression, offering valuable insights for personalized management strategies.
About the journal:
International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings.
The scope of the journal covers:
Information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.;
Computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.;
Educational computer-based programs pertaining to medical informatics or medicine in general;
Organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.