Predicting major adverse cardiac events in diabetes and chronic kidney disease: a machine learning study from the Silesia Diabetes-Heart Project
Hanna Kwiendacz, Bi Huang, Yang Chen, Oliwia Janota, Krzysztof Irlik, Yang Liu, Marta Mantovani, Yalin Zheng, Mirela Hendel, Julia Piaśnik, Wiktoria Wójcik, Uazman Alam, Janusz Gumprecht, Gregory Y H Lip, Katarzyna Nabrdalik
Cardiovascular Diabetology, volume 24(1), article 76, published 2025-02-15. DOI: 10.1186/s12933-025-02615-w (https://doi.org/10.1186/s12933-025-02615-w). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11829423/pdf/
Abstract
Background: People living with diabetes mellitus (DM) and chronic kidney disease (CKD) are at high risk of cardiovascular events (CVEs); however, the predictive performance of traditional risk prediction methods is limited.
Methods: We used machine learning (ML) models to predict CVEs in persons with DM and CKD from the Silesia Diabetes-Heart Project, a routine standard-of-care dataset. CVEs were defined as a composite of nonfatal myocardial infarction, new-onset heart failure, nonfatal stroke, incident atrial fibrillation, percutaneous coronary intervention or coronary artery bypass grafting, and hospitalisation or death due to cardiovascular disease. Five ML models (logistic regression [LR], random forest [RF], support vector classification [SVC], light gradient boosting machine [LGBM], and eXtreme gradient boosting machine [XGBM]) were constructed. Their predictive performance was compared, and model interpretability was evaluated using Shapley Additive exPlanations (SHAP).
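To illustrate the idea behind SHAP, the following is a minimal sketch of exact Shapley value attribution for a toy risk score. It is not the authors' pipeline: the function `toy_risk`, its coefficients, the patient values, and the reference values are all invented for illustration, and practical SHAP tooling for tree models typically relies on faster specialised algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    A feature that is "absent" from a coalition takes its baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight shared by every coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                present = set(coalition)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_risk(features):
    """Hypothetical risk score; every coefficient here is made up."""
    egfr, age, tyg = features
    score = 0.004 * (90 - egfr) + 0.003 * (age - 60) + 0.05 * (tyg - 8.5)
    # Small interaction term so the attribution is not trivially linear.
    score += 0.0001 * (90 - egfr) * (age - 60)
    return score


patient = [45.0, 72.0, 9.2]    # eGFR, age, TyG index (made-up values)
reference = [90.0, 60.0, 8.5]  # made-up reference profile
phi = shapley_values(toy_risk, patient, reference)
for name, value in zip(["eGFR", "age", "TyG index"], phi):
    print(f"{name}: {value:+.4f}")
```

The attributions sum to the difference between the score for the patient and the score for the reference profile, which is the additivity property that makes Shapley-based feature rankings interpretable.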
Results: A total of 1,116 people with DM and CKD, out of 3,056 with DM, were included (median age 67 [IQR 57-76] years; 57% men). The incidence of CVEs was 14.1% (157/1,116) over a median follow-up of 3.1 years. Ten important features were identified through univariate logistic regression, Boruta, and Least Absolute Shrinkage and Selection Operator (LASSO) regression. Among the five ML models based on these features, LGBM had the highest area under the curve (AUC = 0.740, 95% confidence interval [CI] 0.738-0.743), followed by XGBM (AUC = 0.710, 95% CI 0.707-0.713), RF (AUC = 0.707, 95% CI 0.704-0.709), SVC (AUC = 0.707, 95% CI 0.704-0.710), and LR (AUC = 0.621, 95% CI 0.618-0.623). LGBM also had comparatively high recall (0.739), F1-score (0.820), and G-mean (0.826). The SHAP plot of LGBM revealed that estimated glomerular filtration rate (eGFR), age, and triglyceride glucose index were the three most important features for predicting CVEs.
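For readers less familiar with the reported metrics, the sketch below shows how recall, F1-score, G-mean, and AUC can be computed from predictions. It assumes the common definitions (G-mean as the geometric mean of sensitivity and specificity, AUC as the probability that a randomly chosen event case is scored above a randomly chosen non-event case); the abstract does not state which exact definitions or averaging scheme the authors used. The confusion-matrix counts and scores are hypothetical and are not the study's data.

```python
from math import sqrt


def classification_metrics(tp, fn, fp, tn):
    """Recall, precision, specificity, F1 and G-mean from confusion counts."""
    recall = tp / (tp + fn)          # sensitivity
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    g_mean = sqrt(recall * specificity)
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
        "g_mean": g_mean,
    }


def auc(labels, scores):
    """AUC as the share of (event, non-event) pairs ranked correctly.

    Ties between an event score and a non-event score count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only.
metrics = classification_metrics(tp=30, fn=10, fp=20, tn=140)
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

# The incidence figure quoted in the Results can be checked directly.
incidence_pct = round(157 / 1116 * 100, 1)

print(metrics)
print(toy_auc, incidence_pct)
```

G-mean is often reported alongside recall for imbalanced outcomes such as this one (about 14% events), because it penalises a model that achieves high sensitivity only by sacrificing specificity.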
Conclusion: ML models based on ten features, especially the LGBM model, had acceptable performance in predicting CVEs in persons with DM and CKD. Decreased eGFR, older age, and elevated inflammatory markers contributed significantly to the predictive capability of the model. External validation of our model is required before implementation in a clinical environment.
About the journal:
Cardiovascular Diabetology is a journal that welcomes manuscripts exploring various aspects of the relationship between diabetes, cardiovascular health, and the metabolic syndrome. We invite submissions related to clinical studies, genetic investigations, experimental research, pharmacological studies, epidemiological analyses, and molecular biology research in this field.