Development and validation of an interpretable machine learning model for predicting post-stroke epilepsy

Epilepsy Research · IF 2 · Zone 4 (Medicine) · Q3 Clinical Neurology · Pub Date: 2024-06-28 · DOI: 10.1016/j.eplepsyres.2024.107397
Yue Yu, Zhibin Chen, Yong Yang, Jiajun Zhang, Yan Wang
Epilepsy Research, volume 205, Article 107397. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0920121124001128
Citations: 0

Abstract

Background

Epilepsy is a serious complication after an ischemic stroke. Although two studies have developed prediction models for post-stroke epilepsy (PSE), their accuracy remains insufficient, and their applicability to different populations is uncertain. With the rapid advancement of computer technology, machine learning (ML) offers new opportunities for creating more accurate prediction models. However, the potential of ML in predicting PSE is still not well understood. The purpose of this study was to develop prediction models for PSE among ischemic stroke patients.

Methods

Patients with ischemic stroke from two stroke centers were included in this retrospective cohort study. At baseline, 33 input variables were considered as candidate features. The 2-year PSE prediction models were built in the derivation cohort using six ML algorithms. The predictive performance of these ML models was appraised and compared with that of a reference model using conventional triage classification information. The Shapley additive explanation (SHAP) method, which is based on fair profit allocation among many stakeholders according to their contributions, was used to interpret the predicted outcomes of the naive Bayes (NB) model.
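The "fair profit allocation" idea behind SHAP can be illustrated with a minimal sketch of exact Shapley values. Everything in it is an assumption made for illustration: the feature names, the baseline risk, the effect sizes, and the interaction term are hypothetical and are not taken from the study, and the toy risk function stands in for the NB model. Practical SHAP implementations also handle "absent" features by averaging over background data, which this sketch simplifies to switching an effect on or off.

```python
from itertools import permutations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values: average each feature's marginal
    contribution over every possible ordering of the features."""
    totals = {f: 0.0 for f in features}
    for order in permutations(features):
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            totals[f] += value_fn(with_f) - value_fn(coalition)
            coalition = with_f
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in totals.items()}

# Hypothetical toy risk score (NOT the study's model or coefficients).
BASELINE = 0.05
EFFECTS = {
    "nihss_high": 0.10,
    "long_stay": 0.04,
    "d_dimer_high": 0.03,
    "cortical": 0.08,
}
INTERACTION = 0.06  # extra risk when nihss_high and cortical co-occur

def toy_risk(present):
    """Risk for a patient in whom only the features in `present` are active."""
    risk = BASELINE + sum(EFFECTS[f] for f in present)
    if {"nihss_high", "cortical"} <= present:
        risk += INTERACTION
    return risk

phi = shapley_values(toy_risk, list(EFFECTS))
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

In this toy setting the interaction term is split equally between the two features that produce it, and the contributions sum to the difference between the full-feature risk and the baseline risk. That additivity is what lets each prediction be read as a baseline plus per-feature pushes.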

Results

A total of 1977 patients were included to build the predictive model for PSE. The Boruta method identified NIHSS score, hospital length of stay, D-dimer level, and cortical involvement as the optimal features, with areas under the receiver operating characteristic curve (AUC) ranging from 0.709 to 0.849. An additional 870 patients were used to validate the ML and reference models. The NB model achieved the best performance among the PSE prediction models, with an AUC of 0.757. At the 20% absolute risk threshold, the NB model provided a sensitivity of 0.739 and a specificity of 0.720. The reference model had a poor sensitivity of only 0.15 despite achieving an AUC of 0.732. Furthermore, SHAP analysis demonstrated that a higher NIHSS score, longer hospital length of stay, higher D-dimer level, and cortical involvement were positive predictors of epilepsy after ischemic stroke.
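The contrast between a similar AUC and a very different sensitivity follows from what each metric measures: AUC depends only on how patients are ranked, while sensitivity and specificity depend on where predicted risks fall relative to a fixed threshold. The sketch below illustrates this with made-up numbers. The labels and predicted risks are invented toy data, not the study's data, and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, risk, threshold):
    """Classify as positive when predicted risk >= threshold."""
    tp = sum(1 for y, r in zip(y_true, risk) if y == 1 and r >= threshold)
    fn = sum(1 for y, r in zip(y_true, risk) if y == 1 and r < threshold)
    tn = sum(1 for y, r in zip(y_true, risk) if y == 0 and r < threshold)
    fp = sum(1 for y, r in zip(y_true, risk) if y == 0 and r >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, risk):
    """AUC as the probability that a random positive case is ranked
    above a random negative case (ties count as one half)."""
    pos = [r for y, r in zip(y_true, risk) if y == 1]
    neg = [r for y, r in zip(y_true, risk) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented toy data: 1 = developed PSE, 0 = did not.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
# Model A spreads its predicted risks around the 20% threshold.
risk_a = [0.05, 0.08, 0.12, 0.15, 0.25, 0.30, 0.18, 0.22, 0.35, 0.40]
# Model B ranks patients identically but predicts lower risks overall.
risk_b = [0.02, 0.03, 0.05, 0.06, 0.10, 0.12, 0.07, 0.09, 0.15, 0.24]

THRESHOLD = 0.20
auc_a, auc_b = auc(y_true, risk_a), auc(y_true, risk_b)
sens_a, spec_a = sensitivity_specificity(y_true, risk_a, THRESHOLD)
sens_b, spec_b = sensitivity_specificity(y_true, risk_b, THRESHOLD)
print(f"A: AUC={auc_a:.3f} sens={sens_a:.2f} spec={spec_a:.2f}")
print(f"B: AUC={auc_b:.3f} sens={sens_b:.2f} spec={spec_b:.2f}")
```

Both toy models have the same AUC because they order the patients identically, yet model B flags far fewer true cases at the 20% threshold. A model with acceptable discrimination can therefore still miss most high-risk patients at a given cut-off.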

Conclusions

Our study confirmed the feasibility of using ML with easy-to-obtain variables to predict PSE accurately, which may support improved management strategies and more effective resource allocation for high-risk patients. In addition, the SHAP method could improve model transparency and make it easier for clinicians to judge the prediction model's reliability.
