Using a robust model to detect the association between anthropometric factors and T2DM: machine learning approaches.

BMC Medical Informatics and Decision Making (IF 3.8; Zone 3, Medicine; JCR Q2, Medical Informatics). Pub Date: 2025-01-31. DOI: 10.1186/s12911-025-02887-y
Nafiseh Hosseini, Hamid Tanzadehpanah, Amin Mansoori, Mostafa Sabzekar, Gordon A Ferns, Habibollah Esmaily, Majid Ghayour-Mobarhan
Journal Article. BMC Medical Informatics and Decision Making, volume 25, issue 1, page 49. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11786328/pdf/
Citations: 0

Abstract

Background: The aim of this study was to evaluate candidate models for identifying the anthropometric factors most strongly associated with type 2 diabetes mellitus (T2DM).

Method: The analysis used a dataset derived from the Mashhad Stroke and heart atherosclerotic disorders (MASHAD) study, comprising 9354 subjects aged 35-65 years. Of these, 25% (2336 people) were diabetic and 75% (7018 people) were non-diabetic. Ten anthropometric factors and age, measured in all participants, were analyzed. A K-nearest neighbor (KNN) model was used to assess the association between T2DM and the selected factors. The model was evaluated using accuracy, sensitivity, specificity, precision and F1-measure. The receiver operating characteristic (ROC) curve and factor importance were also determined. The performance of the KNN model was compared with that of artificial neural network (ANN) and support vector machine (SVM) models.
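
The five evaluation measures named in the Method can be illustrated with a minimal Python sketch. It assumes binary labels with 1 meaning diabetic and 0 meaning non-diabetic; the function name `binary_metrics` and the label vectors in the usage lines are hypothetical and are not taken from the MASHAD data or from the authors' code.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # recall for the positive (diabetic) class
    specificity: float  # recall for the negative (non-diabetic) class
    precision: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute the five measures from true and predicted labels.

    Assumption: any label not equal to `positive` counts as negative.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return BinaryMetrics(
        accuracy=ratio(tp + tn, tp + tn + fp + fn),
        sensitivity=sensitivity,
        specificity=ratio(tn, tn + fp),
        precision=precision,
        f1=ratio(2 * precision * sensitivity, precision + sensitivity),
    )


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    preds = [1, 1, 0, 1, 0, 0, 0, 0]
    print(binary_metrics(truth, preds))
```

With a 25%/75% class split like the one reported, accuracy alone can look favorable even for a weak classifier, which is one reason to read sensitivity and specificity alongside it.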

Result: After feature selection analysis and assessment of multicollinearity, six factors were used in the final model: mid-arm circumference (MAC), waist circumference (WC), body roundness index (BRI), body adiposity index (BAI), body mass index (BMI) and age. BRI, BAI and MAC in males, and BMI, BRI and MAC in females, were found to have the greatest association with T2DM. The accuracy of the KNN model was approximately 93% for both genders. The best K (number of neighbors) for the model was 4, which gave the lowest error rate. The area under the ROC curve (AUC) was 0.985 for men and 0.986 for women. The KNN model achieved the best result of the models explored.
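
The abstract reports that K was chosen by lowest error rate but does not describe the procedure. One common way to do this is sketched below in Python, under stated assumptions: a held-out validation set, Euclidean distance and majority voting. The helper names (`knn_predict`, `error_rate`, `best_k`) and the toy points are hypothetical; this is not the authors' implementation or data.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], x),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in neighbors)
    # On a tied vote, most_common keeps first-seen order, so the label
    # of the closer neighbor wins.
    return votes.most_common(1)[0][0]


def error_rate(train_X, train_y, val_X, val_y, k):
    """Fraction of validation points that are misclassified for a given k."""
    wrong = sum(
        knn_predict(train_X, train_y, x, k) != y
        for x, y in zip(val_X, val_y)
    )
    return wrong / len(val_y)


def best_k(train_X, train_y, val_X, val_y, candidates):
    """Return (k with the lowest validation error, {k: error}).

    Ties go to the candidate listed first.
    """
    errors = {
        k: error_rate(train_X, train_y, val_X, val_y, k)
        for k in candidates
    }
    return min(errors, key=errors.get), errors


if __name__ == "__main__":
    # Toy 2-D points with one deliberately mislabeled point near the
    # class-0 cluster; illustration only.
    train_X = [(0, 0), (0, 1), (1, 0), (0.4, 0.4), (5, 5), (5, 6), (6, 5)]
    train_y = [0, 0, 0, 1, 1, 1, 1]
    val_X = [(0.5, 0.5), (5.5, 5.5)]
    val_y = [0, 1]
    print(best_k(train_X, train_y, val_X, val_y, candidates=[1, 3, 5]))
```

In this toy setup K = 1 is misled by the single mislabeled point, while larger K values outvote it. Real anthropometric features sit on different scales (for example, age in years versus BMI), so standardizing them before computing distances is usually advisable; the abstract does not say whether this was done.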

Conclusion: The KNN model achieved high accuracy (93%) in predicting the association between anthropometric factors and T2DM. The choice of the K parameter (number of nearest neighbors) has an essential impact on reducing the error rate. Feature selection analysis reduces the dimensionality of the KNN model and increases the accuracy of the final results.

Source journal: BMC Medical Informatics and Decision Making
CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month
Journal description: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.