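Regularized Machine Learning Models for Prediction of Metabolic Syndrome Using GCKR, APOA5, and BUD13 Gene Variants: Tehran Cardiometabolic Genetic Study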

IF 16.4 | CAS Zone 1, Chemistry | Q1 Chemistry, Multidisciplinary | Accounts of Chemical Research | Pub Date: 2023-08-01 | DOI: 10.22074/cellj.2023.2000864.1294

Nadia Alipour, Anoshirvan Kazemnejad, Mahdi Akbarzadeh, Farzad Eskandari, Asiyeh Sadat Zahedi, Maryam S Daneshpour


Citations: 0

Abstract



Objective: Metabolic syndrome (MetS) is a complex multifactorial disorder that considerably burdens healthcare systems. We aim to classify MetS using regularized machine learning models in the presence of the risk variants of GCKR, BUD13 and APOA5, and environmental risk factors.

Materials and methods: A cohort study was conducted on 2,346 cases and 2,203 controls drawn from eligible Tehran Cardiometabolic Genetic Study (TCGS) participants, whose data were collected from 1999 to 2017. We used five regularization approaches [least absolute shrinkage and selection operator (LASSO), ridge regression (RR), elastic net (ENET), adaptive LASSO (aLASSO), and adaptive ENET (aENET)] and a classical logistic regression (LR) model to classify MetS and to select influential predictors of MetS. Demographics, clinical features, and common polymorphisms in the GCKR, BUD13, and APOA5 genes were assessed to classify participants' MetS status. Model performance was evaluated by 10-fold cross-validation repeated 10 times. The models were compared on sensitivity, specificity, classification accuracy, area under the receiver operating characteristic curve (AUC-ROC), and area under the precision-recall curve (AUC-PR).
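The evaluation protocol described above (10-fold cross-validation repeated 10 times, scored by AUC-ROC, sensitivity, and specificity) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' code: the function names, the fixed seed, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example, and the split is a plain (unstratified) one because the abstract does not say whether folds were stratified. AUC-PR is omitted for brevity.

```python
import random

def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        for fold in range(n_folds):
            test = order[fold::n_folds]  # every n_folds-th shuffled index
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield repeat, fold, train, test

def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random case outscores a random
    control, counting ties as one half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return wins / (len(cases) * len(controls))

def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Sample size taken from the abstract: 2,346 cases plus 2,203 controls.
splits = list(repeated_kfold(2346 + 2203))
print(len(splits))  # number of train/test splits (10 repeats x 10 folds)

# Made-up labels and predicted probabilities, NOT data from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
print(auc_roc(toy_labels, toy_scores))
print(sensitivity_specificity(toy_labels, toy_scores))
```

In a full analysis, a model would be fitted on each training index set and scored on the matching test set, and the metrics would then be averaged over all splits.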

Results: During the follow-up period, 50.38% of participants developed MetS. The case and control groups differed in baseline characteristics and risk variants. In the LR model, MetS was significantly associated with age, gender, years of schooling, body mass index (BMI), and the alternate alleles of all the risk variants. A comparison of accuracy, AUC-ROC, and AUC-PR indicated that the regularized models outperformed LR. The regularized models performed comparably to one another, but the aLASSO model was more parsimonious, selecting fewer predictors.
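One general mechanism by which an adaptive LASSO can end up sparser than a plain LASSO can be illustrated with soft-thresholding. The sketch below assumes the idealized textbook setting of a linear model with an orthonormal design, where the LASSO solution is a soft-thresholded version of the unpenalized estimates (up to a scaling constant that depends on how the objective is written). The study itself fits penalized logistic models, so this is an intuition aid rather than a reproduction of its method. The function names, the penalty value, the exponent `gamma`, and the coefficient values are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, setting it to exactly 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_shrink(initial, lam):
    """Plain LASSO: the same threshold lam is applied to every coefficient."""
    return [soft_threshold(b, lam) for b in initial]

def adaptive_lasso_shrink(initial, lam, gamma=1.0):
    """Adaptive LASSO: coefficient j is penalized with weight 1/|b_j|**gamma,
    so small initial estimates face a larger threshold than large ones."""
    shrunk = []
    for b in initial:
        if b == 0:
            shrunk.append(0.0)  # infinite weight: the coefficient stays at zero
        else:
            shrunk.append(soft_threshold(b, lam / abs(b) ** gamma))
    return shrunk

# Hypothetical unpenalized estimates, NOT coefficients from the study.
initial = [2.0, 0.5, -0.3, 0.1]
lam = 0.2  # hypothetical penalty strength

plain = lasso_shrink(initial, lam)
adaptive = adaptive_lasso_shrink(initial, lam)
print(sum(b != 0 for b in plain), sum(b != 0 for b in adaptive))
```

With these made-up numbers, the adaptive version shrinks the largest coefficient less while dropping more of the small ones, which is the kind of behavior that can produce a more parsimonious model.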

Conclusion: Regularized machine learning models yielded more accurate and more parsimonious MetS classifiers than classical LR. These high-performing diagnostic models can lay the foundation for clinical decision support tools that use genetic and demographic variables to identify individuals at high risk for MetS.

