Diabetes Prediction Using Machine Learning Analytics: Ensemble Learning Techniques
D. Tripathi, S. Biswas, S. Reshmi, Arpita Nath Boruah, B. Purkayastha
2022 2nd Asian Conference on Innovation in Technology (ASIANCON), published 2022-08-26
DOI: 10.1109/ASIANCON55314.2022.9908975 (https://doi.org/10.1109/ASIANCON55314.2022.9908975)
Citations: 0
Abstract
Diabetes is an incurable disease caused by a high level of sugar in the blood over a long period of time, so early prediction is needed to reduce its severity. The machine learning (ML) community has studied diabetes prediction for decades. In view of the disease's severity, this paper proposes a model named Diabetes Expert System using Machine Learning Analytics (DESMLA), which explores diabetes data to predict the disease more effectively. The Diabetes Dataset (DD) is imbalanced, so DESMLA applies five prominent oversampling techniques, namely SMOTE, Borderline SMOTE, ADASYN SMOTE, K-Means SMOTE and Gaussian SMOTE, to address the class imbalance. Because DD may contain irrelevant and redundant features, DESMLA also performs feature selection to retain only the features that are significant for diabetes prediction, and it compares the filter and wrapper approaches to feature selection. The experimental results show that DESMLA with the wrapper approach performs better than DESMLA with the filter approach. DESMLA with class imbalance treatment and feature selection shows a promising and significant performance improvement.
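The oversampling step named in the abstract can be illustrated with a minimal sketch of the basic SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is not the DESMLA implementation, which the abstract does not describe. The function names `smote` and `balance`, the parameter `k`, and the toy data are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than peers
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=3, seed=0):
    """Oversample the minority class until both classes have equal size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    deficit = (len(y) - len(minority)) - len(minority)
    new_points = smote(minority, max(deficit, 0), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data invented for this example (not the paper's Diabetes Dataset):
# six majority samples (label 0) and two minority samples (label 1).
X = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2], [1.1, 2.1], [0.8, 1.8], [1.3, 2.3],
     [5.0, 6.0], [5.5, 6.2]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance(X, y)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

In practice, oversampling is usually applied to the training split only, so that synthetic points do not leak into the evaluation data. The other variants listed in the abstract differ mainly in which minority samples are chosen as the base for interpolation.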