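Feature value quantization and reduction process for predicting heart attack possibility and the level of severity by a machine learning model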

Md Zawharul Islam, Md. Atahar Ishrak, A. H. M. Kamal
Journal: International Journal of Engineering and Computer Science
Publication date: 2024-07-17
DOI: 10.18535/ijecs/v13i07.4831 (https://doi.org/10.18535/ijecs/v13i07.4831)
Citation count: 0

Abstract

Heart disease is a prevalent condition that can be deadly if it goes undiagnosed, and researchers have designed many machine learning models to predict it. In this study, we propose a model that trains on fewer attribute columns and uses the selected features to determine the severity of the heart problem. Features were selected with two methods: a repeated correlation heat map and information gain. We trained the model on the UCI Cleveland heart disease dataset. We removed duplicate records to improve the accuracy score, and we encoded the categorical columns with one-hot (OH) encoding, which can improve prediction accuracy. Eight classifiers were used overall: Support Vector Machine, Logistic Regression, K-Nearest Neighbour, Naive Bayes, Decision Tree, Random Forest, AdaBoost, and XGBoost. With the repeated correlation heat map, we compared the accuracy score at each repetition; here AdaBoost, at the fbs row of the heat map, achieved the highest accuracy for heart disease detection, 92%. With information gain, we compared the accuracy score each time features were chosen according to their information gain value; XGBoost and Logistic Regression both reached an accuracy of 93.44%, but Logistic Regression required less time than XGBoost. Model performance was evaluated with accuracy, precision, recall, F1-score, sensitivity, specificity, and the AUC of the ROC curve. Overall, the results show that our model is reliable and accurate in identifying cardiac disease and its level of severity.
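The preprocessing and information-gain steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the rows are toy values invented for illustration (not records from the Cleveland dataset), the column names `cp` and `target` are assumptions (only `fbs` is named in the abstract), and the helper functions are hypothetical. It removes duplicate rows, one-hot encodes a categorical column, and ranks the resulting columns by information gain.

```python
import math
from collections import Counter

def drop_duplicates(rows):
    """Keep the first occurrence of each identical row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels):
    """Label entropy minus the weighted label entropy after splitting on the feature."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy rows, invented for illustration only.
rows = [
    {"cp": "typical", "fbs": 0, "target": 1},
    {"cp": "typical", "fbs": 1, "target": 1},
    {"cp": "atypical", "fbs": 0, "target": 0},
    {"cp": "atypical", "fbs": 1, "target": 0},
    {"cp": "typical", "fbs": 0, "target": 1},  # exact duplicate of the first row
]

unique_rows = drop_duplicates(rows)
encoded_rows = one_hot(unique_rows, "cp")
labels = [row["target"] for row in encoded_rows]

# Score every non-target column, then sort by descending information gain.
gains = {
    name: information_gain([row[name] for row in encoded_rows], labels)
    for name in encoded_rows[0] if name != "target"
}
ranking = sorted(gains, key=lambda name: (-gains[name], name))
top_features = ranking[:2]  # keep the k highest-scoring columns (k = 2 here)
print(gains, top_features)
```

In this toy data the `cp` columns separate the classes perfectly while `fbs` carries no information, so the ranking keeps the two `cp` columns. A classifier would then be trained on only the retained columns, and the accuracy compared for each choice of k.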
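Most of the evaluation metrics listed in the abstract follow directly from the four cells of a binary confusion matrix. The sketch below computes them for toy labels invented for illustration; the function name `binary_metrics` is hypothetical, and the AUC of the ROC curve is left out because it needs predicted scores rather than hard labels. Note that recall and sensitivity are the same quantity, TP / (TP + FN).

```python
def binary_metrics(y_true, y_pred):
    """Compute standard metrics for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # identical to sensitivity
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }

# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The sketch assumes each denominator is non-zero; a version for real data would need to guard against a class or a prediction value that never occurs.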