K. Reddy, I. Elamvazuthi, A. Aziz, S. Paramasivam, Hui Na Chua
2020 8th International Conference on Intelligent and Advanced Systems (ICIAS), published 2021-07-13
DOI: 10.1109/ICIAS49414.2021.9642676 (https://doi.org/10.1109/ICIAS49414.2021.9642676)
Heart Disease Risk Prediction using Machine Learning with Principal Component Analysis
Cardiovascular diseases (CVDs) kill about 17.9 million people every year. Early prediction can help people change their lifestyles and, if necessary, undergo proper medical treatment. Data available in the healthcare sector is useful for predicting whether a patient will develop a disease in the future. In this research, several machine learning algorithms, namely Decision Tree (DT), Discriminant Analysis (DA), Logistic Regression (LR), Naïve Bayes (NB), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Ensemble classifiers, were trained on the Cleveland heart disease dataset. The performance of the algorithms was evaluated using 10-fold cross-validation, both without and with Principal Component Analysis (PCA). LR provided the highest accuracy, 85.8%, with PCA keeping 9 components; among the Ensemble classifiers, a Bagged tree attained an accuracy of 83.8% with PCA keeping 10 components.
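The evaluation protocol described in the abstract (a classifier scored by 10-fold cross-validation, once on the raw features and once after PCA) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption for illustration rather than the authors' implementation: the data is synthetic (not the Cleveland dataset), the helper names (`make_data`, `fit_pca`, `fit_logreg`, `cross_val_accuracy`) are hypothetical, logistic regression is fit by plain gradient descent, the component count is arbitrary, and fitting PCA on each training fold only is a choice the abstract does not specify.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_data(n=200, d=6, seed=0):
    """Synthetic two-class data (a stand-in, NOT the Cleveland dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Only the first two features carry class signal; the rest are noise.
        X.append([rng.gauss(1.5 * label if j < 2 else 0.0, 1.0) for j in range(d)])
        y.append(label)
    return X, y

def fit_pca(X, k, iters=200, seed=0):
    """Return (mean, top-k principal directions) via orthogonalized power iteration."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    Z = [[row[j] - mean[j] for j in range(d)] for row in X]
    cov = [[sum(z[a] * z[b] for z in Z) / (n - 1) for b in range(d)] for a in range(d)]
    rng = random.Random(seed)
    comps = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _step in range(iters):
            v = [dot(cov_row, v) for cov_row in cov]
            for c in comps:  # remove directions already found
                proj = dot(v, c)
                v = [vi - proj * ci for vi, ci in zip(v, c)]
            norm = math.sqrt(dot(v, v))
            v = [vi / norm for vi in v]
        comps.append(v)
    return mean, comps

def project(X, mean, comps):
    """Center rows with the training mean and project onto the components."""
    return [[dot([xj - mj for xj, mj in zip(row, mean)], c) for c in comps] for row in X]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logreg(X, y, epochs=100, lr=0.5):
    """Plain batch gradient descent on the logistic loss (no regularization)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [sigmoid(b + dot(w, row)) - t for row, t in zip(X, y)]
        w = [wj - lr * sum(e * row[j] for e, row in zip(errs, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errs) / n
    return w, b

def cross_val_accuracy(X, y, n_components=None, folds=10, seed=0):
    """Pooled k-fold accuracy; PCA (if requested) is fit on each training fold only."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]  # every sample lands in exactly one test fold
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        X_te = [X[i] for i in test]
        if n_components is not None:
            mean, comps = fit_pca(X_tr, n_components)
            X_tr, X_te = project(X_tr, mean, comps), project(X_te, mean, comps)
        w, b = fit_logreg(X_tr, y_tr)
        correct += sum((b + dot(w, row) >= 0) == (y[i] == 1)
                       for row, i in zip(X_te, test))
    return correct / len(X)

X, y = make_data()
acc_raw = cross_val_accuracy(X, y)                  # all 6 synthetic features
acc_pca = cross_val_accuracy(X, y, n_components=2)  # 2 principal components
print(f"10-fold accuracy without PCA: {acc_raw:.3f}")
print(f"10-fold accuracy with PCA:    {acc_pca:.3f}")
```

The accuracies printed by this sketch reflect only the synthetic data and say nothing about the 85.8% and 83.8% figures reported above. Fitting the projection inside each fold keeps the held-out samples from influencing the components, which is one common way to avoid an optimistic estimate.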