Prediction of metabolic syndrome using machine learning approaches based on genetic and nutritional factors: a 14-year prospective-based cohort study.

IF 2.1 · CAS Zone 4 (Medicine) · JCR Q3, GENETICS & HEREDITY · BMC Medical Genomics · Pub Date: 2024-09-04 · DOI: 10.1186/s12920-024-01998-1
Dayeon Shin
Citations: 0

Abstract

Introduction: Metabolic syndrome is a chronic disease associated with multiple comorbidities. Over the last few years, machine learning techniques have been used to predict metabolic syndrome. However, studies incorporating demographic, clinical, laboratory, dietary, and genetic factors to predict the incidence of metabolic syndrome in Koreans are limited. In the present study, we propose a genome-wide polygenic risk score for the prediction of metabolic syndrome, along with other factors, to improve the prediction accuracy of metabolic syndrome.

Methods: We developed seven machine learning-based models (Cox multivariable regression, deep neural network (DNN), support vector machine (SVM), stochastic gradient descent (SGD), random forest (RAF), Naïve Bayes (NBA) classifier, and AdaBoost (ADB)) to predict the incidence of metabolic syndrome at year 14 using the dataset from the Korean Genome and Epidemiology Study (KoGES) Ansan and Ansung.

Results: Of the 5,440 patients, 2,120 were considered to have new-onset metabolic syndrome. The AUC values of the model that included sex, age, alcohol intake, energy intake, marital status, education status, income status, smoking status, dried laver intake, and a genome-wide polygenic risk score (gPRS) Z-score based on 344,447 SNPs (p-value < 1.0) were highest for RAF (0.994 [95% CI 0.985, 1.000]) and ADB (0.994 [95% CI 0.986, 1.000]).
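
The abstract does not define how the gPRS Z-score is computed. A common construction, assumed here rather than taken from the paper, sums each participant's effect-allele dosages weighted by per-SNP effect sizes and then standardizes the totals across the cohort. The function names, toy dosages, and weights below are hypothetical and serve only as a minimal sketch.

```python
from statistics import mean, pstdev

def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, or 2) for one participant."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))

def z_scores(values):
    """Standardize raw scores to mean 0 and population standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("scores are constant; Z-scores are undefined")
    return [(v - mu) / sigma for v in values]

# Hypothetical toy data: 4 participants x 3 SNPs (the study used 344,447 SNPs).
weights = [0.10, -0.05, 0.20]   # assumed per-SNP effect sizes
cohort = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 0],
]
raw = [polygenic_score(row, weights) for row in cohort]
gprs_z = z_scores(raw)
```

Standardizing puts the score on the same unitless scale as other normalized predictors, so a value of 1 means one standard deviation above the cohort mean.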

Conclusions: Incorporating both gPRS and demographic, clinical, laboratory, and seaweed data led to enhanced metabolic syndrome risk prediction by capturing the distinct etiologies of metabolic syndrome development. The RAF- and ADB-based models predicted metabolic syndrome more accurately than the NBA-based model for the Korean population.
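
The Results compare models by AUC with 95% confidence intervals, but the abstract does not say how those intervals were obtained. The sketch below assumes a percentile bootstrap, which is one common choice and not necessarily the author's method. The AUC itself is computed with the rank (Mann-Whitney) formulation; all names and the toy data are hypothetical.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    Equals the probability that a randomly chosen case scores higher than a
    randomly chosen non-case, counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (an assumption)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # resample lacks one class, so AUC is undefined
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical toy predictions: 1 = new-onset metabolic syndrome, 0 = none.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.4, 0.7, 0.8, 0.9]
point = auc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
```

Running the same two functions on each model's held-out predictions would give one AUC and interval per model, the form in which the RAF, ADB, and NBA results are reported.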

Source journal: BMC Medical Genomics (Medicine: Genetics)
CiteScore: 3.90
Self-citation rate: 0.00%
Articles published: 243
Review time: 3.5 months
About the journal: BMC Medical Genomics is an open access journal publishing original peer-reviewed research articles in all aspects of functional genomics, genome structure, genome-scale population genetics, epigenomics, proteomics, systems analysis, and pharmacogenomics in relation to human health and disease.