Risk factors identification for stroke prognosis using machine learning algorithms

IF 0.9 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS Jordanian Journal of Computers and Information Technology Pub Date : 2022-01-01 DOI:10.5455/jjcit.71-1652725746
T. Ahammad
Citations: 2

Abstract

Stroke is a life-threatening condition and the second leading cause of death worldwide. It remains a challenging 21st-century public health problem for healthcare professionals and researchers, and proper monitoring can help prevent strokes and reduce their severity. Risk factor analysis is a promising approach for identifying the presence of stroke. Numerous studies have focused on forecasting stroke in patients, and most report good accuracy, around 90%, on the publicly available dataset. Two gaps remain. First, combining several preprocessing tasks can considerably improve classifier quality, yet this area still needs research. Second, researchers should pinpoint the major risk factors for stroke and use advanced classifiers to forecast its likelihood. This article presents an enhanced approach for identifying potential risk factors and predicting the incidence of stroke on a publicly available clinical dataset; the method considers and resolves significant gaps in previous studies. It incorporates ten classification models, including advanced boosting classifiers, to detect the presence of stroke. Classifier performance is analyzed on all possible subsets of the selected attributes (features) with respect to five metrics to find the best-performing algorithms. The experimental results show that the proposed approach achieved the best accuracy across all feature selections. Overall, the study's main achievement is a higher stroke prediction accuracy (97% using boosting classifiers) than state-of-the-art approaches on the stroke dataset. Hence, physicians can use gradient-boosting and ensemble tree-based boosting models, which are the most suitable for predicting patients' strokes in the real world. The investigation also reveals that age, heart disease, glucose level, hypertension, and marital status are the most significant risk factors, while the remaining attributes are still essential to obtaining the best performance.
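
The evaluation described in the abstract (every feature subset scored on five metrics) can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's implementation: the feature identifiers are hypothetical names for the five risk factors listed above, the abstract does not say which five metrics were used (accuracy, precision, recall, specificity, and F1 are assumed here), and the labels are toy values invented for the example.

```python
from itertools import combinations

# Hypothetical identifiers for the five risk factors named in the abstract.
RISK_FACTORS = ["age", "heart_disease", "glucose_level", "hypertension", "marital_status"]


def all_feature_subsets(features):
    """Return every non-empty subset of `features` as a tuple."""
    subsets = []
    for size in range(1, len(features) + 1):
        subsets.extend(combinations(features, size))
    return subsets


def classification_metrics(y_true, y_pred):
    """Compute five binary-classification metrics (1 = stroke, 0 = no stroke).

    Assumption: the abstract does not name its five metrics; these are common choices.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Guard against empty denominators (e.g. no predicted positives).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


subsets = all_feature_subsets(RISK_FACTORS)
print(len(subsets))  # 2**5 - 1 = 31 non-empty subsets

# Toy labels invented for illustration; not data from the study.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1]
print(classification_metrics(y_true, y_pred))
```

In a full experiment, each subset would be used to train and score every classifier, so the number of runs grows as (2^n - 1) times the number of models for n features.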
Source journal: Jordanian Journal of Computers and Information Technology (Computer Science: Computer Science (all))
CiteScore: 3.10
Self-citation rate: 25.00%
Number of articles published: 19