Kritsada Sreebunpeng, Jonathan H. Chan, A. Meechai
CSBio '20: Proceedings of the Eleventh International Conference on Computational Systems-Biology and Bioinformatics. Published 2020-11-19. DOI: 10.1145/3429210.3429212 (https://doi.org/10.1145/3429210.3429212)
Identification of Gene Subnetwork Biomarkers of Lung Cancer from RNA-seq Data
In recent years, the increasing availability of cancer RNA-seq datasets has provided unprecedented information and opportunities for cancer biomarker discovery. In this study, we applied our previously published Gene Sub-Network-based Feature Selection (GSNFS) method to RNA-seq-based gene expression data of lung cancer to identify gene-subnetwork biomarkers. In addition, we explored five filter-based feature selection techniques for ranking the identified subnetworks. The majority of the top 10 ranked subnetworks were associated with cancer pathways such as the MAPK signalling pathway. Using a Support Vector Machine (SVM) classifier, evaluated by the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve under 10-fold cross-validation and cross-dataset validation, we showed that the gene-subnetwork biomarkers obtained by RNA-seq-based GSNFS analysis had excellent classification performance. Additionally, by comparing the top-ranked subnetworks from the RNA-seq-based GSNFS analysis with those previously obtained from DNA microarray-based GSNFS analysis, we categorized the subnetworks and found cancer pathways unique to each data type.
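
The ranking and evaluation steps described above can be illustrated with a minimal sketch. It is not the GSNFS implementation. It assumes that a subnetwork's activity is the mean of its member genes' z-scored expression, and it uses a Welch t-statistic as one possible filter-based ranking criterion; the abstract does not name the five filters used. The gene names, subnetwork names, and expression values are invented for illustration, and the SVM and cross-validation steps are omitted.

```python
from statistics import mean, stdev, variance

def zscore(values):
    """Standardize one gene's expression across samples."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

def subnetwork_activity(expr, genes):
    """Assumed activity score: mean z-scored expression of member genes, per sample."""
    z = [zscore(expr[g]) for g in genes]
    return [mean(col) for col in zip(*z)]

def welch_t(activity, labels):
    """Welch t-statistic between tumor (1) and normal (0) samples."""
    a = [x for x, y in zip(activity, labels) if y == 1]
    b = [x for x, y in zip(activity, labels) if y == 0]
    return (mean(a) - mean(b)) / (variance(a) / len(a) + variance(b) / len(b)) ** 0.5

def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data, invented for illustration: 3 normal samples followed by 3 tumor samples.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "G1": [1.0, 1.2, 0.9, 3.1, 2.8, 3.3],
    "G2": [2.0, 2.1, 1.8, 4.0, 4.2, 3.9],
    "G3": [5.0, 4.1, 5.3, 4.6, 5.2, 4.4],
    "G4": [0.5, 0.9, 0.4, 0.8, 0.5, 0.7],
}
# Hypothetical subnetworks; real ones would come from a gene interaction network.
subnetworks = {"SN_A": ["G1", "G2"], "SN_B": ["G3", "G4"]}

activity = {name: subnetwork_activity(expr, genes) for name, genes in subnetworks.items()}

# Rank subnetworks by the absolute value of the filter score, highest first.
ranked = sorted(subnetworks, key=lambda n: abs(welch_t(activity[n], labels)), reverse=True)

for name in ranked:
    print(name, round(auc(activity[name], labels), 3))
```

In a fuller pipeline the activity scores of the top-ranked subnetworks would serve as input features to a classifier, and the AUC would be computed on held-out folds rather than on the data used for ranking.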