Title: A risk assessment and prediction framework for diabetes mellitus using machine learning algorithms
Authors: Salliah Shafi Bhat, Madhina Banu, Gufran Ahmad Ansari, Venkatesan Selvam
Journal: Healthcare analytics (New York, N.Y.), volume 4, Article 100273
Publication date: 2023-10-23
Publication type: Journal Article
DOI: 10.1016/j.health.2023.100273
URL: https://www.sciencedirect.com/science/article/pii/S2772442523001405
PDF: https://www.sciencedirect.com/science/article/pii/S2772442523001405/pdfft?md5=0eb0088277442a491debaaea7ad5d6d2&pid=1-s2.0-S2772442523001405-main.pdf
Citation count: 0
Abstract
Diabetes seriously threatens people's health and is becoming increasingly common. Diabetes Mellitus (DM) is a condition characterized by high blood sugar levels and associated with inactivity, unhealthy eating, being overweight, and other factors. This research article analyzes various risk prediction models and algorithms for diabetes, including Type 1, Type 2, and Gestational Diabetes, and develops several Machine Learning (ML) models for predicting diabetes using various datasets. The process involves Feature Engineering (FE), that is, producing highly informative features. We used the Pima Indian Diabetes Dataset (PIDD) to evaluate how effectively the ML models predict diabetes. Using Python, we applied three classification algorithms (Logistic Regression, Gradient Boost, and Decision Tree) in combination with feature selection techniques. Among these classifiers, Decision Tree has the highest accuracy (91%), precision (96%), recall (92%), and F1 score (94%).
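The four evaluation measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions, assuming that "diabetic" is treated as the positive class. The `ConfusionCounts` type, the `classification_metrics` function, and the example counts are hypothetical illustrations. They are not taken from the paper and do not reproduce its reported figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (assumption: diabetic = positive class)."""
    tp: int  # diabetic patients predicted as diabetic
    fp: int  # non-diabetic patients predicted as diabetic
    fn: int  # diabetic patients predicted as non-diabetic
    tn: int  # non-diabetic patients predicted as non-diabetic


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall, and F1 for the given counts.

    Any ratio with a zero denominator is reported as 0.0.
    """
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from the paper).
example = ConfusionCounts(tp=40, fp=10, fn=10, tn=40)
for name, value in classification_metrics(example).items():
    print(f"{name}: {value:.2f}")
```

Reporting precision and recall alongside accuracy matters for a screening task like this one, because accuracy alone can look high when one class dominates the dataset.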