Regularized Machine Learning Models for Prediction of Metabolic Syndrome Using GCKR, APOA5, and BUD13 Gene Variants: Tehran Cardiometabolic Genetic Study.
Nadia Alipour, Anoshirvan Kazemnejad, Mahdi Akbarzadeh, Farzad Eskandari, Asiyeh Sadat Zahedi, Maryam S Daneshpour
Cell Journal, 25(8), 536-545. Published 2023-08-01. DOI: https://doi.org/10.22074/cellj.2023.2000864.1294
Objective: Metabolic syndrome (MetS) is a complex multifactorial disorder that considerably burdens healthcare systems. We aim to classify MetS using regularized machine learning models in the presence of the risk variants of GCKR, BUD13 and APOA5, and environmental risk factors.
Materials and methods: A cohort study was conducted on 2,346 cases and 2,203 controls from eligible Tehran Cardiometabolic Genetic Study (TCGS) participants whose data were collected from 1999 to 2017. We used five regularization approaches [least absolute shrinkage and selection operator (LASSO), ridge regression (RR), elastic net (ENET), adaptive LASSO (aLASSO), and adaptive ENET (aENET)] and a classical logistic regression (LR) model to classify MetS and to select the influential variables that predict it. Demographics, clinical features, and common polymorphisms in the GCKR, BUD13, and APOA5 genes were assessed as predictors of MetS status among eligible participants. Model performance was evaluated by 10-fold cross-validation repeated 10 times. The models were compared on sensitivity, specificity, classification accuracy, and the areas under the receiver operating characteristic curve (AUC-ROC) and the precision-recall curve (AUC-PR).
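For orientation, the six models differ only in the penalty added to the logistic loss. The block below is a sketch in standard textbook notation, not the authors' stated parameterization: the symbols \ell, \lambda, \alpha, \gamma, w_j, and the initial estimate \tilde\beta_j are assumptions introduced here, and the abstract does not say how they were chosen or tuned.

```latex
% Requires amsmath. Notation assumed for illustration:
%   \ell(\beta)    logistic log-likelihood over the p predictors
%   \lambda \ge 0  penalty strength, \alpha \in [0,1] elastic net mixing weight
%   w_j = 1/|\tilde\beta_j|^{\gamma}, with \tilde\beta_j an initial estimate and \gamma > 0
\begin{align*}
\text{LR:}     \quad & \hat\beta = \arg\min_{\beta}\; -\ell(\beta) \\
\text{RR:}     \quad & \hat\beta = \arg\min_{\beta}\; -\ell(\beta) + \lambda \sum_{j=1}^{p} \beta_j^{2} \\
\text{LASSO:}  \quad & \hat\beta = \arg\min_{\beta}\; -\ell(\beta) + \lambda \sum_{j=1}^{p} |\beta_j| \\
\text{ENET:}   \quad & \hat\beta = \arg\min_{\beta}\; -\ell(\beta) + \lambda \Big[ \alpha \sum_{j=1}^{p} |\beta_j| + (1-\alpha) \sum_{j=1}^{p} \beta_j^{2} \Big] \\
\text{aLASSO:} \quad & \hat\beta = \arg\min_{\beta}\; -\ell(\beta) + \lambda \sum_{j=1}^{p} w_j |\beta_j| \\
\text{aENET:}  \quad & \hat\beta = \arg\min_{\beta}\; -\ell(\beta) + \lambda \Big[ \alpha \sum_{j=1}^{p} w_j |\beta_j| + (1-\alpha) \sum_{j=1}^{p} \beta_j^{2} \Big]
\end{align*}
```

The absolute-value terms can shrink coefficients exactly to zero, which is what lets the LASSO-type models select variables, whereas the squared term in RR only shrinks them toward zero. The adaptive weights penalize predictors with small initial estimates more heavily, which generally favors sparser fits.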
Results: During the follow-up period, 50.38% of participants developed MetS. The two groups differed in baseline characteristics and risk variants. According to LR, MetS was significantly associated with age, gender, schooling years, body mass index (BMI), and the alternate alleles of all the risk variants. A comparison of accuracy, AUC-ROC, and AUC-PR indicated that the regularized models outperformed LR. The regularized machine learning models performed comparably to one another, but the aLASSO model was more parsimonious and selected fewer predictors.
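As a rough illustration of the comparison metrics, the sketch below computes sensitivity, specificity, accuracy, and AUC-ROC in plain Python. The function names, the 0.5 decision threshold, and the toy labels and scores are all invented for this example and are not taken from the study; AUC-PR is omitted for brevity.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # share of cases correctly flagged
        "specificity": tn / (tn + fp),   # share of controls correctly cleared
        "accuracy": (tp + tn) / len(pairs),
    }


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (hypothetical predicted probabilities).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = confusion_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, auc)
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, while AUC-ROC summarizes ranking quality across all thresholds, which is one reason to report both kinds of measure when comparing models.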
Conclusion: Regularized machine learning yielded more accurate and more parsimonious MetS classification models than classical LR. These high-performing diagnostic models can lay the foundation for clinical decision support tools that use genetic and demographic variables to identify individuals at high risk for MetS.
About the journal:
The “Cell Journal (Yakhteh)”, formerly published as “Yakhteh Medical Journal”, is a quarterly English-language publication of Royan Institute. The journal focuses on topics in cellular and molecular science and related fields. Cell J was certified by the Ministry of Culture and Islamic Guidance in 1999 and accredited as a scientific and research journal by the HBI (Health and Biomedical Information) Journal Accreditation Commission in 2000. It is an open access journal.