DeepGx
Joseph M. de Guia, M. Devaraj, C. Leung
Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Published: 2019-08-27
DOI: 10.1145/3341161.3343516 (https://doi.org/10.1145/3341161.3343516)
Citations: 56
Abstract
This paper explores the problems involved in classifying cancer from gene expression data using a deep learning model. To classify ribonucleic acid sequencing (RNA-seq) data extracted from the Pan-Cancer Atlas, we propose transforming the 1-dimensional (1D) gene expression values into 2-dimensional (2D) images. Embedding the gene expression values in a 2D image takes the overall features of the genes into account and lets a convolutional neural network (CNN) compute the features needed for the classification task. When trained and tested on the 33 cohorts of cancer types, our classification model reached an accuracy of 95.65%. This result compares reasonably well with existing works that use multiclass label classification. We also examine the genes by their significance to the cancer types through a heat map and associate them with biomarkers. Our CNN for the classification task promotes the use of deep learning frameworks in cancer genome analysis and leads to a better understanding of complex features in cancer.
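The 1D-to-2D transformation described above can be illustrated with a minimal sketch. The abstract does not say how genes are ordered, scaled, or padded, so everything here is an assumption rather than the authors' method: the helper name `expression_to_image` is hypothetical, values are min-max scaled to [0, 1], laid out row by row, and the unused cells of the smallest enclosing square are filled with zeros.

```python
import math

import numpy as np


def expression_to_image(values, side=None):
    """Embed a 1D expression vector into a square 2D array.

    Hypothetical helper: the scaling, row-major layout, and zero padding
    are illustrative assumptions, not details taken from the paper.
    """
    v = np.asarray(values, dtype=np.float32)
    if v.ndim != 1 or v.size == 0:
        raise ValueError("expected a non-empty 1D vector of expression values")

    # Smallest square that can hold every value: side = ceil(sqrt(n)).
    if side is None:
        side = math.isqrt(v.size - 1) + 1
    if side * side < v.size:
        raise ValueError("side is too small to hold all expression values")

    # Min-max scale to [0, 1]; a constant vector maps to all zeros.
    lo, hi = float(v.min()), float(v.max())
    scaled = (v - lo) / (hi - lo) if hi > lo else np.zeros_like(v)

    # Zero-pad the tail, then fold the vector row by row into an image.
    padded = np.zeros(side * side, dtype=np.float32)
    padded[: v.size] = scaled
    return padded.reshape(side, side)


if __name__ == "__main__":
    image = expression_to_image([0.0, 5.0, 10.0, 5.0, 0.0])
    print(image.shape)
    print(image)
```

An array of this form, with a channel axis added, is the kind of single-channel input a CNN image classifier accepts. How genes are arranged in the grid affects which values end up adjacent, so the ordering is a design choice in its own right.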