A multi-context feature learning approach to identify disease-specific gene neighborhoods
S. Ghandikota, A. Jegga
Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, 2020-09-21
DOI: https://doi.org/10.1145/3388440.3412419
Abstract
Analyzing gene networks in a specific phenotype state can provide important insights into the pathways and biological processes underlying the onset and progression of a disease. Specifically, analyzing gene neighborhoods around key disease-driver genes and transcription factors (TFs) can lead to the discovery of regulatory networks and novel therapeutic targets. Traditional methods to decipher these regulatory networks mostly rely on transcriptomic signals and do not incorporate the different functional contexts available, making them inadequate to model the inherently complex relationships between genes and their neighborhoods. We present a neural network-based representation learning framework that uses both co-expression and functional gene contexts to learn continuous gene representations. It can be used to extract distributed representations of genes in normal (e.g., control, wild-type) and perturbed (e.g., disease, knockout) states by integrating co-expressed gene pairs from multiple transcriptomic datasets. To show the utility of this approach, we trained our model on whole lung tissue transcriptomic studies of idiopathic pulmonary fibrosis (IPF) to generate disease-specific gene representations. We compare the gene features from our method with those from two other representation learning methods by generating and analyzing the regulatory gene neighborhoods of known transcription factors in the lung tissue. Using several TF-target gene set libraries, we show that the regulatory gene neighborhoods generated by our method are biologically relevant.
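The abstract does not give implementation details, so the following is only a minimal sketch of the two steps that surround the learning model: pooling co-expressed gene pairs across several datasets, and reading off a gene neighborhood from a set of gene vectors. It does not implement the neural network itself. The function names, the Pearson threshold of 0.8, the "at least two datasets" rule, the cosine-similarity ranking, and all gene labels and numbers are assumptions made up for illustration, not the authors' method or data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpressed_pairs(datasets, threshold=0.8, min_datasets=2):
    """Keep gene pairs with |r| >= threshold in at least min_datasets datasets.

    Each dataset is a dict {gene: [expression value per sample]}.
    The threshold and the voting rule are hypothetical choices.
    """
    counts = {}
    for expr in datasets:
        for g1, g2 in combinations(sorted(expr), 2):
            if abs(pearson(expr[g1], expr[g2])) >= threshold:
                counts[(g1, g2)] = counts.get((g1, g2), 0) + 1
    return {pair for pair, c in counts.items() if c >= min_datasets}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu, nv = sqrt(sum(a * a for a in u)), sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def neighborhood(embeddings, gene, k=2):
    """Return the k genes most similar to `gene` by cosine similarity."""
    query = embeddings[gene]
    scored = [
        (cosine(query, vec), other)
        for other, vec in embeddings.items()
        if other != gene
    ]
    return [other for _, other in sorted(scored, reverse=True)[:k]]


# Toy expression values (invented); "TF1", "G1", ... are placeholder labels.
study_a = {"TF1": [1, 2, 3, 4], "G1": [2, 4, 6, 8], "G2": [4, 1, 3, 2]}
study_b = {"TF1": [2, 3, 5, 7], "G1": [1, 2, 4, 6], "G2": [3, 1, 4, 2]}
pairs = coexpressed_pairs([study_a, study_b])

# Toy gene vectors standing in for learned representations (not learned here).
toy_embeddings = {
    "TF1": [1.0, 0.0],
    "G1": [0.9, 0.1],
    "G2": [0.0, 1.0],
    "G3": [0.5, 0.5],
}
neighbors = neighborhood(toy_embeddings, "TF1", k=2)
```

In a pipeline of this shape, the pooled pairs would serve as training contexts for the representation model, and the resulting neighborhood of each transcription factor could then be compared against TF-target gene set libraries, for example by counting shared genes.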