Targeted unsupervised features learning for gene expression data analysis to predict cancer stage
Authors: Imene Zenbout, Abdelkrim Bouramoul, S. Meshoul
Published in: Proceedings of the Tenth International Conference on Computational Systems-Biology and Bioinformatics
Publication date: 2019-12-04
DOI: 10.1145/3365953.3365958 (https://doi.org/10.1145/3365953.3365958)
Citation count: 0
Abstract
The rapid growth of large-scale cancer gene expression data has brought several computational challenges, yet it has also opened great opportunities to explore different pathways for improving cancer prognosis, diagnosis and treatment. In this paper, we propose a targeted unsupervised learning model based on deep autoencoders (TAE) that learns a significant cancer representation from the expO data set integrated in the Gene Expression Omnibus (GEO), with the ultimate goal of constructing an accurate predictive model of cancer stage. The trained model was tested on two cancer gene expression data sets: lung cancer for clinical stage and intensive breast cancer (IBC) for pathological stage. For both cancer types, the model extracted a new feature space based on the knowledge built from the expO data set. The generated features were then used to train classifiers that predict the cancer stage of each sample. We evaluated the effectiveness of our proposal by comparing it with unsupervised dimensionality reduction by principal component analysis (PCA) and with a supervised univariate feature selection method. The experimental results show promising performance: the model builds collaborative knowledge from different cancer types that enhances the prediction rate for different cancer stages.
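The workflow described in the abstract (unsupervised pre-training on a large source cohort, feature extraction for a smaller labelled cohort, then classification and comparison with PCA) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the data are synthetic, the layer sizes and learning rate are arbitrary, a single-hidden-layer autoencoder stands in for the deep TAE, and a nearest-centroid rule stands in for the paper's classifiers. It is not the authors' implementation and does not reproduce their results.

```python
import numpy as np

rng = np.random.default_rng(0)
N_GENES, N_LATENT, N_HIDDEN = 50, 5, 8          # toy sizes, not taken from the paper
MIXING = rng.normal(size=(N_LATENT, N_GENES))   # structure shared by both synthetic cohorts

def make_expression(n):
    """Synthetic stand-in for expression profiles: nonlinear low-rank signal plus noise."""
    z = rng.normal(size=(n, N_LATENT))
    x = np.tanh(z @ MIXING) + 0.1 * rng.normal(size=(n, N_GENES))
    stage = (z[:, 0] > 0).astype(int)           # hypothetical binary "stage" label
    return x, stage

def train_autoencoder(x, n_hidden, epochs=500, lr=0.1):
    """Full-batch gradient descent on MSE for a tanh encoder and a linear decoder."""
    d = x.shape[1]
    w1 = rng.normal(scale=0.1, size=(d, n_hidden)); b1 = np.zeros(n_hidden)
    w2 = rng.normal(scale=0.1, size=(n_hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)                # encode
        err = h @ w2 + b2 - x                   # decode and compare with the input
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size            # gradient of the loss w.r.t. the reconstruction
        g_h = (g_out @ w2.T) * (1.0 - h ** 2)   # backpropagate through tanh
        w2 -= lr * (h.T @ g_out); b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (x.T @ g_h);   b1 -= lr * g_h.sum(axis=0)
    return (w1, b1), losses

def encode(x, params):
    """Map samples into the learned feature space using the encoder half only."""
    w1, b1 = params
    return np.tanh(x @ w1 + b1)

def nearest_centroid_accuracy(f_train, y_train, f_test, y_test):
    """Tiny stand-in classifier: assign each test sample to the closest class centroid."""
    classes = np.unique(y_train)
    centroids = np.stack([f_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((f_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(classes[dists.argmin(axis=1)] == y_test))

# 1) Unsupervised pre-training on a large unlabelled source cohort (the role expO plays).
x_source, _ = make_expression(500)
encoder, losses = train_autoencoder(x_source, N_HIDDEN)

# 2) Small labelled target cohort, split into train and test halves.
x_target, y_target = make_expression(120)
x_tr, x_te, y_tr, y_te = x_target[:60], x_target[60:], y_target[:60], y_target[60:]

# 3) Features from the pre-trained encoder versus a PCA baseline fitted on the target data.
acc_ae = nearest_centroid_accuracy(encode(x_tr, encoder), y_tr, encode(x_te, encoder), y_te)
mean = x_tr.mean(axis=0)
_, _, vt = np.linalg.svd(x_tr - mean, full_matrices=False)
components = vt[:N_HIDDEN].T                    # top principal directions as columns
acc_pca = nearest_centroid_accuracy((x_tr - mean) @ components, y_tr,
                                    (x_te - mean) @ components, y_te)
print(f"autoencoder features: {acc_ae:.2f}, PCA features: {acc_pca:.2f}")
```

The point of the sketch is the data flow: the encoder never sees stage labels or target samples during training, so any benefit on the target cohort comes from structure shared with the source cohort. On real data the relative ranking of the two feature sets depends on the cohorts and on the classifier, so the printed accuracies here carry no evidential weight.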