A Proposed Technique Using Machine Learning for the Prediction of Diabetes Disease through a Mobile App

International Journal of Intelligent Systems | IF 5 | Zone 2, Computer Science | JCR Q1, Computer Science, Artificial Intelligence | Pub Date: 2024-01-09 | DOI: 10.1155/2024/6688934
Hosam El-Sofany, Samir A. El-Seoud, Omar H. Karam, Yasser M. Abd El-Latif, Islam A. T. F. Taj-Eddin

Abstract

With the increasing prevalence of diabetes in Saudi Arabia, early detection and prediction of the disease are critically needed to prevent long-term health complications. This study addresses that need by implementing a computerized diabetes prediction system that applies machine learning (ML) techniques to the Pima Indians dataset and private diabetes datasets. In contrast to prior research, the study employs a semisupervised model combined with strong gradient boosting to predict the diabetes-related features of the dataset. The researchers also apply the SMOTE technique to address class imbalance. Ten ML classification techniques (logistic regression, random forest, KNN, decision tree, bagging, AdaBoost, XGBoost, voting, SVM, and Naive Bayes) are evaluated to determine which algorithm produces the most accurate diabetes prediction. On the private dataset, XGBoost with SMOTE achieved an accuracy of 97.4%, an F1 score of 0.95, and an AUC of 0.87. On the combined datasets, it achieved an accuracy of 83.1%, an F1 score of 0.76, and an AUC of 0.85. To explain how the model arrives at its predictions, an explainable AI technique based on SHAP is implemented. The study also demonstrates the adaptability of the proposed system by applying a domain adaptation method. To improve accessibility, a mobile app has been developed for instant diabetes prediction from user-entered features. This study contributes novel insights and techniques to the field of ML-based diabetes prediction, potentially aiding the early detection and management of diabetes in Saudi Arabia.
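The abstract names SMOTE as the remedy for imbalanced classes but does not describe it. The following is a minimal sketch of the general idea only, not the authors' implementation: new minority-class samples are created by interpolating between an existing minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy data are hypothetical and chosen purely for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative sketch only: `minority` is a list of equal-length numeric
    tuples; names and defaults here are assumptions, not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority neighbours of the base sample (itself excluded).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(p, base))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with two features (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6, k=3)
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays within the region the minority data already occupies. In practice a maintained library implementation would normally be used, and oversampling would be applied to the training split only.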

Source journal: International Journal of Intelligent Systems (Engineering and Technology: Computer Science, Artificial Intelligence)
CiteScore: 11.30
Self-citation rate: 14.30%
Articles published: 304
Review time: 9 months
About the journal: The International Journal of Intelligent Systems serves as a forum for individuals interested in tapping into the vast theories based on intelligent systems construction. With its peer-reviewed format, the journal explores several fascinating editorials written by today's experts in the field. Because new developments are being introduced each day, there's much to be learned: examination, analysis creation, information retrieval, man–computer interactions, and more. The International Journal of Intelligent Systems uses charts and illustrations to demonstrate these ground-breaking issues, and encourages readers to share their thoughts and experiences.