M. Rahim, Md Alfaz Hossain, Md. Najmul Hossain, Jungpil Shin, K. Yun
Annals of Emerging Technologies in Computing (journal article), published 2023-01-01. DOI: 10.33166/aetic.2023.01.003 (https://doi.org/10.33166/aetic.2023.01.003)
Stacked Ensemble-Based Type-2 Diabetes Prediction Using Machine Learning Techniques
Diabetes is a long-term disease caused by the human body's inability to make enough insulin or to use it properly. It is one of the major health burdens of the present day. Although it is often not severe in its initial stage, over time it can become deadly, gradually affecting organs such as the heart, kidneys, liver, eyes, and brain. Many researchers have applied machine learning and deep learning strategies to predict diabetes efficiently from risk variables such as insulin, BMI, and glucose. We propose a robust approach to diabetes prediction based on a stacked ensemble of several machine learning (ML) methods. The stacked ensemble has two levels: the base models and the meta-model. The base models are a variety of ML classifiers, such as Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Naïve Bayes (NB), and Random Forest (RF), each of which makes different assumptions about the prediction task; the meta-model, a Logistic Regression, makes the final prediction from the predictive outputs of the base models. To assess the proposed model, we used the PIMA Indian Diabetes Dataset (PIMA-IDD). We used linear and stratified sampling to ensure dataset consistency and K-fold cross-validation to guard against overfitting. Experiments showed that the proposed stacked ensemble model outperformed the individual base classifiers as well as the other methods considered, with an accuracy of 94.17%.
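The two-level structure described in the abstract can be sketched with scikit-learn's `StackingClassifier`. This is a minimal illustration under stated assumptions, not the authors' implementation: the hyperparameters, the feature scaling step, the fold counts, and the helper name `build_stacked_model` are all choices made here for the example, and the data is a synthetic stand-in generated with `make_classification` rather than PIMA-IDD, so the resulting score says nothing about the accuracy reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_stacked_model(random_state=0):
    """Hypothetical helper: SVM, KNN, NB and RF base models, LR meta-model."""
    base_models = [
        # Scaling is an assumption added here; SVM and KNN are distance-based.
        ("svm", make_pipeline(StandardScaler(), SVC(random_state=random_state))),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
        ("nb", GaussianNB()),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=random_state)),
    ]
    return StackingClassifier(
        estimators=base_models,
        # The meta-model is trained on out-of-fold predictions of the base models.
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


# Synthetic stand-in data (NOT the PIMA dataset): binary labels, 8 features.
X, y = make_classification(
    n_samples=768,
    n_features=8,
    n_informative=5,
    weights=[0.65, 0.35],
    random_state=0,
)

# Stratified K-fold keeps the class ratio roughly equal in every fold.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(build_stacked_model(), X, y, cv=outer_cv, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"Mean cross-validated accuracy on synthetic data: {mean_accuracy:.3f}")
```

The inner `cv=5` controls how the meta-model's training inputs are produced, while the outer `StratifiedKFold` estimates how the whole stack generalizes; keeping the two separate avoids scoring the ensemble on data its meta-model has already seen.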