Identification of molecular signatures and pathways of obese breast cancer gene expression data by a machine learning algorithm
Betul Comertpay, E. Gov
Journal of Translational Genetics and Genomics, published 2022-01-01
DOI: 10.20517/jtgg.2021.44 (https://doi.org/10.20517/jtgg.2021.44)
Citations: 4
Abstract
Aim: The obesity epidemic is currently one of the biggest problems for human health, and obesity affects survival in patients with breast cancer. However, key biomarkers of obesity-related breast cancer risk are still not well known. Using machine learning to identify the most appropriate features in obesity-associated breast cancer patients may therefore improve the predictive accuracy and interpretability of regression models. Methods: In the present study, we identified 23 differentially expressed genes (DEGs) from the GSE24185 transcriptome dataset. Seed genes were identified from the DEGs, the co-expression network genes, and the hub genes of the protein-protein interaction network. Pathway enrichment analysis was performed for the DEGs. A ridge-penalized regression model was fitted using the P-values of the enriched pathways and the seed gene-pathway association scores to obtain the most relevant molecular signatures. The penalized models were fitted using 10-fold cross-validation. Results: Angiotensin II receptor type 1 (AGTR1), cyclin D1 (CCND1), glutamate ionotropic receptor AMPA type subunit 2 (GRIA2), interleukin-6 cytokine family signal transducer (IL6ST), matrix metallopeptidase 9 (MMP9), and protein kinase cAMP-dependent type II regulatory subunit beta (PRKAR2B) were identified as candidate molecular signatures of obese patients with breast cancer. In addition, RAF-independent MAPK1/3 activation, collagen degradation, bladder cancer, drug metabolism-cytochrome P450, and signaling by Hedgehog pathways in cancer were primarily associated with obesity-associated breast cancer. Conclusion: These genes may be used for risk analysis of disease progression in obese patients with breast cancer. The corresponding genes and pathways should be validated in experimental studies.
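The ridge-penalized regression with 10-fold cross-validation described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the software used, so scikit-learn is an assumption, and the data, the gene labels (`seed_gene_1` to `seed_gene_6`), the matrix layout (pathways as rows, seed genes as columns), and the penalty grid are all hypothetical placeholders.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Hypothetical design matrix: one row per enriched pathway, one column per
# seed gene; entries stand in for gene-pathway association scores.
n_pathways, n_genes = 60, 6
genes = [f"seed_gene_{i + 1}" for i in range(n_genes)]  # placeholder labels
X = rng.normal(size=(n_pathways, n_genes))

# Synthetic response standing in for a transformed enrichment P-value,
# e.g. -log10(P). The "true" coefficients exist only to generate data.
true_coef = np.array([1.5, -1.0, 0.5, 0.0, 2.0, 0.0])
y = X @ true_coef + rng.normal(scale=0.5, size=n_pathways)

# Candidate ridge penalties (assumed grid) and a 10-fold splitter.
alphas = np.logspace(-3, 3, 25)
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# RidgeCV picks the penalty with the best cross-validated score,
# then refits on all rows with that penalty.
model = RidgeCV(alphas=alphas, cv=cv).fit(X, y)

# Rank features by absolute coefficient as a rough relevance ordering.
ranking = sorted(zip(genes, model.coef_), key=lambda t: abs(t[1]), reverse=True)

print("selected alpha:", model.alpha_)
for name, coef in ranking:
    print(f"{name}: {coef:+.3f}")
```

Ridge shrinks coefficients toward zero but does not set them exactly to zero, so a ranking by coefficient magnitude is one plausible way to read off candidate signatures; the abstract does not state which selection criterion was applied.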