Discriminating insulin resistance in middle-aged nondiabetic women using machine learning approaches.

Zailing Xing, Henian Chen, Amy C Alman
AIMS Public Health, 11(2): 667-687. Pub Date: 2024-05-09. eCollection Date: 2024-01-01. DOI: 10.3934/publichealth.2024034
IF 3.1, JCR Q2 (Health Care Sciences & Services)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11252584/pdf/

Abstract

Objective: We employed machine learning algorithms to discriminate insulin resistance (IR) in middle-aged nondiabetic women.

Methods: The data were from the National Health and Nutrition Examination Survey (2007-2018). The study subjects were 2084 nondiabetic women aged 45-64. The analysis included 48 predictors. We randomly divided the data into training (n = 1667) and testing (n = 417) datasets. Four machine learning techniques were employed to discriminate IR: extreme gradient boosting (XGBoosting), random forest (RF), gradient boosting machine (GBM), and decision tree (DT). The area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score were compared as performance metrics to select the optimal technique.
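
The threshold-based metrics named above (all except AUC) derive from the four cells of a binary confusion matrix. The following is a minimal Python sketch of those definitions, not the authors' code. The function name `classification_metrics` is hypothetical, and the example counts are invented; they are chosen only so that they sum to the testing-set size of 417 and are not results from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is nonzero (each class and each predicted
    label occurs at least once).
    """
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    ppv = tp / (tp + fp)                  # positive predictive value (precision)
    npv = tn / (tn + fn)                  # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts for illustration only (NOT taken from the paper);
# they sum to 417, the size of the testing dataset.
example = classification_metrics(tp=120, fp=40, tn=230, fn=27)

# Share of the 2084 subjects assigned to training, roughly an 80/20 split.
train_fraction = 1667 / 2084
```

AUC is not included because it is computed from the model's predicted scores across all thresholds rather than from a single confusion matrix.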

Results: Using all 48 predictors, the XGBoosting algorithm achieved the highest AUC for discriminating IR (0.93 in the training dataset and 0.86 in the testing dataset), followed by the RF, GBM, and DT models. When the models were rebuilt with only the top five predictors, the XGBoosting algorithm remained the optimal prediction model, with an AUC of 0.90 (training dataset) and 0.86 (testing dataset). The SHapley Additive exPlanations (SHAP) values revealed the associations between the five predictors and IR, namely BMI (strongly positive impact on IR), fasting glucose (strongly positive), HDL-C (medium negative), triglycerides (medium positive), and glycohemoglobin (medium positive). The threshold values for identifying IR were 29 kg/m², 100 mg/dL, 54.5 mg/dL, 89 mg/dL, and 5.6% for BMI, glucose, HDL-C, triglycerides, and glycohemoglobin, respectively.
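
To make the reported thresholds concrete, the sketch below flags which of the five predictors fall on the IR-associated side of their threshold for one set of measurements. It is an illustration under stated assumptions, not the authors' model: the abstract's classifier is XGBoosting, not a rule set. The direction of each comparison is inferred from the sign of the SHAP association (an assumption: at or above the threshold for positively associated predictors, at or below for HDL-C), and the names `THRESHOLDS` and `flag_predictors` and the example values are hypothetical.

```python
# Thresholds as reported in the abstract. The comparison direction is an
# ASSUMPTION inferred from the sign of each SHAP association.
THRESHOLDS = {
    "bmi": (29.0, "above"),              # kg/m^2, positive association
    "glucose": (100.0, "above"),         # mg/dL, positive association
    "hdl_c": (54.5, "below"),            # mg/dL, negative association
    "triglycerides": (89.0, "above"),    # mg/dL, positive association
    "glycohemoglobin": (5.6, "above"),   # %, positive association
}


def flag_predictors(values):
    """Return, per predictor, whether the value lies on the IR-associated side."""
    flags = {}
    for name, (cutoff, direction) in THRESHOLDS.items():
        value = values[name]
        if direction == "above":
            flags[name] = value >= cutoff
        else:
            flags[name] = value <= cutoff
    return flags


# Hypothetical measurements for illustration only.
example_flags = flag_predictors({
    "bmi": 31.2,
    "glucose": 96.0,
    "hdl_c": 48.0,
    "triglycerides": 140.0,
    "glycohemoglobin": 5.4,
})
```

With these invented values, BMI, HDL-C, and triglycerides are flagged while glucose and glycohemoglobin are not. Whether the paper treats the thresholds as inclusive is not stated in the abstract.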

Conclusion: The XGBoosting algorithm demonstrated superior performance metrics for discriminating IR in middle-aged nondiabetic women, with BMI, glucose, HDL-C, glycohemoglobin, and triglycerides as the top five predictors.
