Machine learning to predict stroke risk from routine hospital data: A systematic review

IF 3.7 | Zone 2 (Medicine) | JCR Q2, Computer Science, Information Systems | International Journal of Medical Informatics, vol. 196, Article 105811 | Pub Date: 2025-01-28 | DOI: 10.1016/j.ijmedinf.2025.105811

William Heseltine-Carp, Megan Courtman, Daniel Browning, Aishwarya Kasabe, Michael Allen, Adam Streeter, Emmanuel Ifeachor, Martin James, Stephen Mullin


Citations: 0

Abstract

Purpose

Stroke remains a leading cause of morbidity and mortality. Despite this, current risk stratification tools such as CHA₂DS₂-VASc and QRISK3 are of limited accuracy, particularly in those without a diagnosis of atrial fibrillation (AF). Hence, there is a need for more accurate stroke risk prediction models. Machine learning (ML) may provide a solution by leveraging existing routine hospital databases to build accurate stroke risk prediction models and to identify novel risk factors for stroke.
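For readers unfamiliar with the traditional tools that ML models are compared against, the following is a minimal sketch of how a CHA₂DS₂-VASc score is tallied from its standard point weights. The `Patient` class and its field names are hypothetical and chosen for illustration only; they are not taken from the review or from any clinical software.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout, used only for this illustration.
    age: int
    female: bool
    congestive_heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    prior_stroke_tia_or_thromboembolism: bool = False
    vascular_disease: bool = False


def cha2ds2_vasc(p: Patient) -> int:
    """Return the CHA2DS2-VASc score (0 to 9) from the standard point weights."""
    score = 0
    score += 1 if p.congestive_heart_failure else 0             # C
    score += 1 if p.hypertension else 0                         # H
    if p.age >= 75:                                             # A2
        score += 2
    elif p.age >= 65:                                           # A (65 to 74)
        score += 1
    score += 1 if p.diabetes else 0                             # D
    score += 2 if p.prior_stroke_tia_or_thromboembolism else 0  # S2
    score += 1 if p.vascular_disease else 0                     # V
    score += 1 if p.female else 0                               # Sc
    return score


if __name__ == "__main__":
    example = Patient(age=70, female=True, hypertension=True)
    print(cha2ds2_vasc(example))
```

A score of this kind uses a handful of binary inputs with fixed integer weights, which is one reason such tools are easy to apply at the bedside but may discriminate less well than models trained on richer routine data.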

Aims

In this systematic review, we appraise current research using ML to predict stroke risk from routine hospital data. Based on these findings, we then highlight common methodological limitations and make recommendations for future research.

Methods

In this review, we identify 49 original research articles (38 in the general population and 11 in AF-specific populations) from the PubMed database, published between January 2013 and December 2024, that use ML and routine hospital data to predict the risk of stroke.

Results

ML models were able to accurately predict stroke risk in both AF-specific and general populations, with AUCs ranging from 0.64 to 0.99. Where tested, ML also consistently outperformed traditional risk stratification tools, such as CHA₂DS₂-VASc. ML also appeared useful in identifying several novel risk factors from electrocardiogram, laboratory test and echocardiography data.

However, the quality of datasets was often limited, there was a high suspicion of overfitting, and models often lacked calibration, external validation and explainability analysis.
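To illustrate why a high AUC alone does not show that a model is well calibrated, the sketch below computes the AUC (discrimination) and the Brier score (one common summary of calibration and overall accuracy) for two sets of toy predictions. The data and function names are assumptions made for illustration and do not come from any study in the review.

```python
def roc_auc(y_true, y_prob):
    """AUC as the probability that a random positive case is ranked above
    a random negative case (ties count as half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome
    (lower is better; 0 is perfect)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


if __name__ == "__main__":
    # Toy outcomes: 1 = stroke, 0 = no stroke (illustrative only).
    outcomes = [0, 0, 1, 1]
    sensible_risks = [0.10, 0.20, 0.80, 0.90]
    inflated_risks = [0.90, 0.92, 0.95, 0.99]  # same ranking, risks far too high

    for name, risks in [("sensible", sensible_risks), ("inflated", inflated_risks)]:
        print(name, roc_auc(outcomes, risks), round(brier_score(outcomes, risks), 5))
```

Both sets of predictions rank the cases identically and so share the same AUC, yet the inflated risks give a much worse Brier score. This is the kind of gap that reporting calibration alongside discrimination, ideally on an external dataset, is meant to expose.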

Conclusion

Whilst ML has shown great potential in predicting stroke and identifying novel risk factors for stroke, improvements in study methodology are required prior to integration of ML into routine healthcare. Future research should adhere to the EQUATOR guidance on prediction models and encourage interdisciplinary collaboration between computer scientists and clinicians. Further prospective RCTs are also required to validate models in the clinical setting and to identify the barriers to integrating ML into routine healthcare.


Source journal

International Journal of Medical Informatics (Medicine: Computer Science, Information Systems)

CiteScore: 8.90

Self-citation rate: 4.10%

Articles published: 217

Review time: 42 days

Journal description: International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings. The scope of the journal covers: information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.; computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.; educational computer-based programs pertaining to medical informatics or medicine in general; and organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.