Comparison of the Performance of Machine Learning Algorithms in Predicting Heart Disease

Sajad Yousefi
Journal: Frontiers in Health Informatics
Publication date: 2021-12-27
DOI: 10.30699/fhi.v10i1.349 (https://doi.org/10.30699/fhi.v10i1.349)
Citations: 7

Abstract

Introduction: Heart disease is often associated with conditions such as arteries clogged by accumulated deposits, which cause chest pain and heart attack. Many people die of heart disease every year. Most countries have a shortage of cardiovascular specialists, and as a result a significant percentage of cases are misdiagnosed. Predicting the disease is therefore an important problem. By applying machine learning models to a multidimensional dataset, this article aims to identify the most efficient and accurate models for disease prediction.

Material and Methods: Several algorithms were used to predict heart disease, among which the supervised learning methods Decision Tree, Random Forest, and KNN are the most frequently mentioned. The algorithms were applied to a dataset of 294 samples taken from the UCI repository. The dataset contains features related to heart disease. To improve algorithm performance, these features were analyzed, and feature importance scores and cross-validation were taken into account.

Results: The algorithms were compared with one another: each model was evaluated on its ROC curve and on criteria such as accuracy, precision, sensitivity, and F1 score. The Decision Tree algorithm achieved an accuracy of 83% and an AUC ROC of 99%. The Logistic Regression algorithm, with an accuracy of 88% and an AUC ROC of 91%, performed better than the other algorithms. These techniques can therefore help physicians identify patients with heart disease and prescribe appropriately.

Conclusion: Machine learning techniques can be used in medicine to analyze data collections related to a disease and to predict it. The area under the ROC curve and the evaluation criteria of several machine learning classification algorithms were compared to determine the most appropriate classifier for predicting heart disease. In this evaluation, the Decision Tree and Logistic Regression models showed the best performance.
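The evaluation criteria named in the abstract (accuracy, precision, sensitivity, F1 score, and the area under the ROC curve) can be illustrated with a minimal sketch. It is not the author's implementation, which the abstract does not describe. The labels and scores below are invented toy values, not data from the UCI dataset, and the function names and the 0.5 decision threshold are assumptions made for illustration only.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, sensitivity (recall), and F1 for binary labels (1 = disease)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

# Assumed decision threshold of 0.5 turns scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)

print(metrics)
print("AUC ROC:", auc)
```

Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank positives against negatives. This is why a model can show a lower accuracy together with a higher AUC than another model, as with the Decision Tree and Logistic Regression figures reported above.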