Sibo Prasad Patro, Neelamadhab Padhy, Rahul Deo Sah
2022 OITS International Conference on Information Technology (OCIT), 2022-12-01. DOI: 10.1109/OCIT56763.2022.00016 (https://doi.org/10.1109/OCIT56763.2022.00016)
Classification model for heart disease prediction using correlation and feature selection techniques
Accurate, real-time analysis and prediction of heart disease is highly important. Many medical diagnosis problems exhibit class imbalance because the number of patients with a given disease is much smaller than the number of healthy people in the population. This work uses feature selection to identify the most relevant features for heart disease prediction. The experiments are performed on the Framingham Heart Study dataset using the OneR, GA, and CORR feature selection methods. Using the Chi-squared test, six highly correlated features are selected for disease prediction. The experimental results show that CORR has the lowest mean rank, 8.16%, and that the proposed model using SVM achieves the best accuracy, 67%, on oversampled data.
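As an illustration of two steps named in the abstract, Chi-squared feature scoring and oversampling of the minority class, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`chi2_score`, `select_top_k`, `random_oversample`), the choice of simple random oversampling, and the toy data are all assumptions made for the example, and the data is invented rather than taken from the Framingham Heart Study dataset. The SVM classifier itself is not shown.

```python
import random
from collections import Counter


def chi2_score(feature, labels):
    """Pearson chi-squared statistic for one categorical feature vs. the class label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f_val, f_count in feature_totals.items():
        for l_val, l_count in label_totals.items():
            # Expected cell count if the feature and the label were independent.
            expected = f_count * l_count / n
            score += (observed[(f_val, l_val)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-squared score."""
    scores = {name: chi2_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy, made-up data (NOT from the Framingham dataset); label 1 = disease present.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
columns = {
    "smoker":   [1, 1, 0, 0, 0, 0, 0, 0],  # hypothetical feature, tracks the label
    "diabetic": [1, 0, 1, 0, 1, 0, 1, 0],  # hypothetical feature, independent of the label
    "male":     [1, 1, 1, 0, 0, 0, 0, 0],  # hypothetical feature, partly associated
}

selected = select_top_k(columns, labels, k=2)
rows = list(zip(*(columns[name] for name in selected)))
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(selected, Counter(balanced_labels))
```

In a real pipeline, oversampling would normally be applied to the training split only, so that duplicated minority rows do not leak into the evaluation data; whether the paper does this is not stated in the abstract.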