Author: Lin Wu
Venue: International Conference on Electronic Technology and Information Science
Publication date: 2023-06-20
DOI: 10.1117/12.2682322 (https://doi.org/10.1117/12.2682322)
Interpretable prediction of heart disease based on random forest and SHAP
To improve the accuracy of heart disease prediction and address the limited interpretability of traditional machine learning models, this paper proposes a heart disease prediction method based on random forests and SHAP values. The method first preprocesses the dataset by encoding the data, filling in missing values, and removing outliers. It then applies recursive feature elimination with cross-validation to discard irrelevant features and retain the relevant ones for model training. Compared with other methods on accuracy, precision, recall, and F1 score, the proposed method outperforms the other models. The SHAP-based explanation shows how each feature value affects the model's prediction and provides a ranking of feature importance. The experimental results indicate that the method improves the accuracy of heart disease prediction and gives a clear interpretation of the model's predictions, so it can serve as an aid in the treatment and prevention of heart disease.
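To make the SHAP step concrete, the following is a minimal sketch of how exact Shapley values attribute one prediction to individual features and yield an importance ranking. The abstract does not say how the SHAP values were computed, so everything here is an assumption made for illustration: `toy_risk_model` is a hypothetical stand-in for the trained random forest, the feature names and numbers are invented, and the brute-force enumeration over feature subsets is only practical for a handful of features (tree-specific algorithms would normally be used for a real forest).

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names and values, invented for illustration only.
FEATURES = ["age", "chol", "max_hr"]
BACKGROUND = [[45, 190, 170], [52, 230, 155], [61, 260, 140], [67, 280, 120]]
PATIENT = [63, 270, 130]


def toy_risk_model(x):
    """Hypothetical stand-in for a trained classifier's risk score in [0, 1]."""
    age, chol, max_hr = x
    score = 0.3 + 0.02 * (age - 50) + 0.004 * (chol - 200) - 0.01 * (max_hr - 150)
    if age > 55 and chol > 240:
        score += 0.2  # simple interaction so the model is not purely additive
    return min(1.0, max(0.0, score))


def coalition_value(model, x, background, subset):
    """Mean prediction with features in `subset` fixed to x, the rest taken from background rows."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in subset else row[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


phi = shapley_values(toy_risk_model, PATIENT, BACKGROUND)
base = coalition_value(toy_risk_model, PATIENT, BACKGROUND, set())

# Rank features by the magnitude of their contribution to this one prediction.
ranking = sorted(zip(FEATURES, phi), key=lambda pair: abs(pair[1]), reverse=True)
for name, value in ranking:
    print(f"{name}: {value:+.4f}")
```

The contributions sum to the difference between the patient's prediction and the average prediction over the background rows, which is what lets each value be read as that feature's share of the deviation from the baseline. Averaging the absolute contributions over many patients would give a global ranking of the kind the abstract describes.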