Diabetes Prediction Using Machine Learning Analytics: Ensemble Learning Techniques

D. Tripathi, S. Biswas, S. Reshmi, Arpita Nath Boruah, B. Purkayastha
Published in: 2022 2nd Asian Conference on Innovation in Technology (ASIANCON)
Publication date: 2022-08-26
DOI: 10.1109/ASIANCON55314.2022.9908975 (https://doi.org/10.1109/ASIANCON55314.2022.9908975)

Abstract

Diabetes is an incurable disease caused by a high level of sugar in the blood over a long period of time. Early prediction is therefore needed to reduce its severity. The Machine Learning (ML) community has worked on diabetes prediction for decades. In view of the severity of the disease, this paper proposes a model named Diabetes Expert System using Machine Learning Analytics (DESMLA), which explores diabetes data to predict the disease more effectively. The Diabetes Dataset (DD) is imbalanced, so DESMLA applies the five most prominent oversampling techniques, namely SMOTE, Borderline SMOTE, ADASYN SMOTE, K-Means SMOTE and Gaussian SMOTE, to address the class imbalance. Because DD may contain irrelevant and redundant features, DESMLA also performs feature selection to retain only the features that are significant for diabetes prediction, and it compares the filter and wrapper approaches to feature selection. The experimental results show that DESMLA with the wrapper approach performs better than DESMLA with the filter approach. With class imbalance treatment and feature selection, DESMLA shows a promising and significant performance improvement.
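
The abstract does not describe how the oversampling step is implemented. As a rough illustration of the idea behind basic SMOTE only (not the Borderline, ADASYN, K-Means or Gaussian variants, and not the authors' code), the following plain-Python sketch creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The function names `smote` and `balance`, the parameter defaults and the toy data are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = (len(X) - len(minority)) - len(minority)
    if n_new <= 0:
        return list(X), list(y)
    new = smote(minority, n_new, k=k, seed=seed)
    return list(X) + new, list(y) + [minority_label] * len(new)


# Toy data (made up): six majority samples (label 0), two minority samples (label 1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.1],
     [2.0, 2.0], [2.2, 2.1]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X, y)
print(y_bal.count(0), y_bal.count(1))  # prints "6 6"
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.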