Huajun Yang, Zhongan Wang, L. Gong, Guichuan Huang, Daigang Chen, Xiao-peng Li, Fei Du, Jiang Lin, Xueyi Yang
A Novel Hypoxia-Related Gene Signature with Strong Predicting Ability in Non-Small-Cell Lung Cancer Identified by Comprehensive Profiling
International Journal of Genomics (impact factor 2.6; JCR Q3, Biochemistry & Molecular Biology). Published 2022-05-19. DOI: 10.1155/2022/8594658 (https://doi.org/10.1155/2022/8594658)
Citations: 1
Abstract
Background: Non-small-cell lung cancer (NSCLC) is the most common malignant tumor among males and females worldwide. Hypoxia is a typical feature of the tumor microenvironment and affects cancer development. Circular RNAs (circRNAs) have been reported to sponge miRNAs, thereby regulating target gene expression, and to play an essential role in tumorigenesis and progression. This study aimed to determine whether circRNAs could serve as diagnostic biomarkers for NSCLC.

Methods: Sample heterogeneity was assessed by principal component analysis (PCA). Expression data from the Gene Expression Omnibus (GEO) database were normalized with the affy R package. Differentially expressed genes (DEGs) and differentially expressed circular RNAs (DEcircRNAs) were screened using the DESeq2 R package. Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment of the DEGs were analyzed using the clusterProfiler R package, and Gene Set Enrichment Analysis (GSEA) was used to identify the biological functions of the DEGs. Interactions among the DEGs and the competing endogenous RNA (ceRNA) network were detected using STRING and visualized using Cytoscape. starBase was used to predict the miRNAs targeting the hub genes, and miRanda was used to predict the target miRNAs of the circRNAs. RNA-seq profiles and clinical information were downloaded from The Cancer Genome Atlas (TCGA) database. Variables were then assessed with univariate and multivariate Cox proportional hazards regression models: variables that were significant in the univariate model were entered into the multivariate model to analyze associations among the clinical variables.
Overall survival was estimated with Kaplan-Meier survival curves, and time-dependent receiver operating characteristic (ROC) curve analysis was used to evaluate and validate the risk score in NSCLC patients. Predictive nomograms were constructed to predict prognosis in the high-risk and low-risk score groups.

Results: We identified a total of 2039 DEGs in hypoxia-treated A549 cells, including 1293 upregulated and 746 downregulated DEGs. Compared with A549 cells under normoxia, hypoxia-treated A549 cells had a total of 70 DEcircRNAs, including 21 upregulated and 49 downregulated DEcircRNAs. The upregulated genes were significantly enriched in 284 GO terms and 42 KEGG pathways, while the downregulated genes were significantly enriched in 184 GO terms and 25 KEGG pathways. Functional analysis by GSEA showed enrichment in the enzyme-linked receptor protein signaling pathway, the hypoxia-inducible factor-1 (HIF-1) signaling pathway, and G protein-coupled receptor (GPCR) downstream signaling. Six hub modules and 10 hub genes (CDC45, EXO1, PLK1, RFC4, CCNB1, CDC6, MCM10, DLGAP5, AURKA, and POLE2) were identified. The constructed ceRNA network consisted of 4 circRNAs, 14 miRNAs, and 38 mRNAs. In the ROC analysis, the area under the curve (AUC) was 0.62, and the optimal threshold was 0.28. Based on this threshold, patients were divided into high-risk and low-risk score groups. The survival rate in the high-risk score group was lower than that in the low-risk score group. Expression of SERPINE1, STC2, and LPCAT1, clinical stage, and patient age were significantly correlated with a high risk score.
Nomograms were established from the risk factors identified in the multivariate analysis and could be used to predict median survival time and 3-year and 5-year survival probability.

Conclusion: A ceRNA network associated with NSCLC was identified, and the hub genes and circRNAs might serve as potential biomarkers for NSCLC.
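
As a rough illustration of the risk stratification described in the Results, the sketch below splits patients at the reported threshold of 0.28 and computes a Kaplan-Meier survival estimate for each group. It is not the authors' code or data: the toy cohort, the names `Patient`, `split_by_risk`, and `kaplan_meier`, and the convention that scores above the threshold count as high risk are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record: risk score, follow-up time in months,
# and event flag (1 = death observed, 0 = censored).
Patient = namedtuple("Patient", ["risk_score", "time", "event"])

THRESHOLD = 0.28  # optimal threshold reported in the abstract


def split_by_risk(patients, threshold=THRESHOLD):
    """Split patients into (high, low) risk groups.

    Assumption: a score strictly above the threshold means high risk.
    """
    high = [p for p in patients if p.risk_score > threshold]
    low = [p for p in patients if p.risk_score <= threshold]
    return high, low


def kaplan_meier(patients):
    """Return [(time, survival_probability)] at each distinct event time."""
    curve = []
    survival = 1.0
    event_times = sorted({p.time for p in patients if p.event == 1})
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for p in patients if p.time >= t)
        deaths = sum(1 for p in patients if p.time == t and p.event == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Made-up toy cohort, for illustration only (not TCGA data).
cohort = [
    Patient(0.45, 6, 1), Patient(0.39, 12, 1),
    Patient(0.51, 20, 0), Patient(0.33, 24, 1),
    Patient(0.10, 18, 1), Patient(0.22, 30, 0),
    Patient(0.15, 36, 1), Patient(0.05, 48, 0),
]

high_risk, low_risk = split_by_risk(cohort)
km_high = kaplan_meier(high_risk)
km_low = kaplan_meier(low_risk)
print("high-risk KM curve:", km_high)
print("low-risk KM curve:", km_low)
```

A dedicated survival-analysis library would normally add confidence intervals and a log-rank test for the difference between the two curves; both are omitted here to keep the sketch dependency-free.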
About the journal:
International Journal of Genomics is a peer-reviewed, Open Access journal that publishes research articles as well as review articles in all areas of genome-scale analysis. Topics covered by the journal include, but are not limited to: bioinformatics, clinical genomics, disease genomics, epigenomics, evolutionary genomics, functional genomics, genome engineering, and synthetic genomics.