Application of machine learning algorithm incorporating dietary intake in prediction of gestational diabetes mellitus.

Endocrine Connections · IF 2.6 · Zone 3 (Medicine) · JCR Q3 (Endocrinology & Metabolism) · Pub Date: 2024-11-21 · Print Date: 2024-12-01 · DOI: 10.1530/EC-24-0169
Tianze Ding, Peijie Liu, Jie Jia, Hui Wu, Jie Zhu, Kefeng Yang

Abstract

Introduction: Gestational diabetes mellitus (GDM) significantly affects pregnancy outcomes. Therefore, it is crucial to develop prediction models since they can guide timely interventions to reduce the incidence of GDM and its associated adverse effects.

Methods: A total of 554 pregnant women were selected, and their sociodemographic characteristics, clinical data and dietary data were collected. Dietary intake was assessed with a validated semi-quantitative food frequency questionnaire (FFQ). Features were selected by random forest mean decrease in impurity, and models were built using logistic regression, XGBoost and LightGBM. Model performance was compared by accuracy, sensitivity, specificity, area under the curve (AUC) and the Hosmer-Lemeshow test.
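The evaluation measures named in the Methods can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the 0.5 decision threshold, the equal-size grouping for Hosmer-Lemeshow and the toy labels and probabilities are all hypothetical, and the abstract does not say how the study computed these quantities.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity at a fixed probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(y_true: Sequence[int], y_prob: Sequence[float],
                    groups: int = 10) -> float:
    """Hosmer-Lemeshow chi-square statistic over equal-size risk groups.

    Returns only the statistic; the p-value (chi-square, groups - 2 degrees
    of freedom) is left out to keep the sketch dependency-free.
    """
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed events
        expected = sum(p for p, _ in chunk)   # expected events
        if size == 0 or not 0 < expected < size:
            continue  # skip empty or degenerate groups
        stat += (observed - expected) ** 2 / expected
        stat += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return stat


# Hypothetical toy data, not taken from the study.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]

metrics = confusion_metrics(y_true, y_prob)
toy_auc = auc(y_true, y_prob)
toy_hl = hosmer_lemeshow(y_true, y_prob, groups=2)
print(metrics, toy_auc, round(toy_hl, 4))
```

A small Hosmer-Lemeshow statistic suggests that predicted and observed event counts agree within the risk groups, while AUC measures ranking only, so the two answer different questions about a model.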

Results: Blood glucose, age, pre-pregnancy body mass index (BMI), triglycerides and high-density lipoprotein cholesterol (HDL) were the top five features according to the feature selection. Among the three algorithms, XGBoost performed best with an AUC of 0.788, LightGBM came second (AUC = 0.749), and logistic regression performed worst (AUC = 0.712). In addition, XGBoost and LightGBM both performed better when dietary information was included than on the non-dietary dataset (AUC 0.788 vs 0.718 for XGBoost; 0.749 vs 0.726 for LightGBM).
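The feature ranking rests on random forest mean decrease in impurity. As a rough illustration of what that score measures, the sketch below computes the Gini impurity decrease for a single threshold split. A full random forest accumulates such decreases over every split in every tree, weighted by the share of samples reaching each node. The function names, thresholds and toy values are hypothetical and are not data from the study.

```python
from typing import Sequence


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a set of binary labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def impurity_decrease(values: Sequence[float], labels: Sequence[int],
                      threshold: float) -> float:
    """Parent impurity minus the size-weighted impurity of the two children."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children


# Hypothetical toy data: one informative feature and one noise feature.
labels = [0, 0, 0, 1, 1, 1]
glucose = [4.2, 4.5, 4.8, 5.3, 5.6, 5.9]  # separates the classes at 5.0
noise = [1, 2, 1, 2, 1, 2]                # nearly unrelated to the labels

informative_gain = impurity_decrease(glucose, labels, threshold=5.0)
noise_gain = impurity_decrease(noise, labels, threshold=1.5)
print(informative_gain, round(noise_gain, 4))
```

A feature whose splits remove more impurity, like the toy glucose column here, ends up higher in the ranking than one whose splits leave the classes mixed.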

Conclusion: XGBoost and LightGBM algorithms outperform logistic regression in predicting GDM among Chinese pregnant women. In addition, dietary data may improve model performance, which deserves more in-depth investigation with a larger sample size.

Source journal: Endocrine Connections (Medicine, Internal Medicine)
CiteScore: 5.00
Self-citation rate: 3.40%
Articles published: 361
Review time: 6 weeks
About the journal: Endocrine Connections publishes original quality research and reviews in all areas of endocrinology, including papers that deal with non-classical tissues as source or targets of hormones and endocrine papers that have relevance to endocrine-related and intersecting disciplines and the wider biomedical community.