Comparing Different Machine Learning Techniques in Predicting Diabetes on Early Stage
Authors: Shweta Yadu, Rashmi Chandra, Vivek Kumar Sinha
Published in: CC 2023, vol. 6 (Journal Article), 2024-03-22
DOI: 10.3390/engproc2024062020 (https://doi.org/10.3390/engproc2024062020)
Citations: 0
Abstract
Diabetes mellitus is a steadily spreading disease that is estimated to cause a significant number of deaths worldwide. It is characterized by the amount of glucose (blood sugar) in the blood. A variety of methods have been used to predict the likelihood of the disease. Predicting diabetes at an early stage requires adequate, clean data on diabetic individuals. In this study, 520 records from a hospital in Bangladesh, each with 16 attributes, were used to make predictions. The dataset is publicly available from the UCI Machine Learning Repository. After feature selection, we applied the Random Forest, AdaBoost, k-nearest neighbors (KNN), and Bagging algorithms. Under 10-fold cross-validation, Random Forest achieved the best test accuracy, with reported scores of 97.03% and 95.03%.
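The evaluation protocol named in the abstract, 10-fold cross-validation, can be illustrated with a minimal standard-library sketch. This is not the authors' code. It assumes synthetic stand-in data (520 records with 16 binary features, generated at random rather than taken from the UCI dataset) and evaluates only a simple KNN classifier with Hamming distance. All function names, the 75% feature-label agreement rate, and the choice of k = 5 neighbors are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def make_synthetic_records(n=520, n_features=16, seed=0):
    """Hypothetical stand-in data: binary features and a binary label."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Each feature agrees with the label 75% of the time (arbitrary choice).
        features = [label if rng.random() < 0.75 else 1 - label
                    for _ in range(n_features)]
        records.append((features, label))
    return records


def knn_predict(train, query, k=5):
    """Majority vote among the k training records nearest in Hamming distance."""
    nearest = sorted(
        train,
        key=lambda rec: sum(a != b for a, b in zip(rec[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(records, k_folds=10, k_neighbors=5):
    """Return one held-out accuracy per fold."""
    accuracies = []
    for fold in k_fold_indices(len(records), k_folds):
        held_out = set(fold)
        # Train on every record outside the current fold.
        train = [rec for i, rec in enumerate(records) if i not in held_out]
        correct = sum(
            knn_predict(train, records[i][0], k_neighbors) == records[i][1]
            for i in fold
        )
        accuracies.append(correct / len(fold))
    return accuracies


records = make_synthetic_records()
scores = cross_validate(records)
print(f"mean accuracy over {len(scores)} folds: {sum(scores) / len(scores):.4f}")
```

Each record is held out exactly once, so the mean of the per-fold accuracies estimates performance on unseen data. The same loop could wrap any of the other classifiers mentioned in the abstract in place of `knn_predict`.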