Authors: Okta Jaya Harmaja, Irvan Prasetia, Yosi Victor Hutagalung, Hendra Ardanis Sirait
Journal: Jusikom : Jurnal Sistem Informasi Ilmu Komputer
Published: 2023-08-24
DOI: https://doi.org/10.34012/jurnalsisteminformasidanilmukomputer.v7i1.4054
COMPARISON OF ENSEMBLE LEARNING ALGORITHM IN CLASSIFYING EARLY DIAGNOSTIC OF DIABETES
Diabetes is a significant public health problem that affects millions of people worldwide. This study compares three ensemble learning algorithms (Random Forest, AdaBoost, and XGBoost) for classifying diabetes diagnoses. Random Forest achieved the highest accuracy at 0.86, followed by XGBoost at 0.85 and AdaBoost at 0.82. All three models perform well and can be used to classify diabetes. The feature importance visualizations show that Random Forest and XGBoost share the same three most important features: Glucose, BMI, and Age. For AdaBoost, the three most important features are DPF, BMI, and Glucose.
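
The kind of comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not give the dataset, the train/test split, or the hyperparameters, so the code uses synthetic data from scikit-learn with generic column names, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example depends on a single library. The function name `compare_ensembles` is hypothetical. The accuracies and rankings it prints will not match the values reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Generic placeholder names; with real data these would be the dataset's
# columns (the abstract mentions Glucose, BMI, Age and DPF).
FEATURES = [f"feature_{i}" for i in range(6)]


def compare_ensembles(seed=0):
    """Fit three ensemble classifiers and report accuracy and top-3 features."""
    # Synthetic stand-in data, NOT the dataset used in the study.
    X, y = make_classification(
        n_samples=600,
        n_features=len(FEATURES),
        n_informative=4,
        n_redundant=0,
        random_state=seed,
    )
    # Assumed 80/20 stratified split; the abstract does not state the split.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    models = {
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=seed),
        # Stand-in for XGBoost (assumption made to avoid an extra dependency).
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        accuracy = accuracy_score(y_test, model.predict(X_test))
        # Rank features by the model's impurity-based importance scores.
        ranked = sorted(
            zip(FEATURES, model.feature_importances_),
            key=lambda pair: pair[1],
            reverse=True,
        )
        results[name] = {
            "accuracy": accuracy,
            "top3": [feature for feature, _ in ranked[:3]],
        }
    return results


if __name__ == "__main__":
    for name, result in compare_ensembles().items():
        print(f"{name}: accuracy={result['accuracy']:.2f}, top-3={result['top3']}")
```

Comparing the `top3` lists across models mirrors the abstract's observation that two of the models agree on their three most important features while AdaBoost ranks a different one first. A single train/test split gives a noisy accuracy estimate, so differences as small as 0.01 between models are better checked with cross-validation.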