Parallel Computing Algorithms for Reverse-Engineering and Analysis of Genome-Wide Gene Regulatory Networks from Gene Expression Profiles
V. Belcastro, D. Bernardo, F. Gregoretti, G. Oliva
Infinity, vol. 14, no. 1, pp. 88-94. Published 2010-09-30.
DOI: https://doi.org/10.1109/PDMC-HIBI.2010.20
Citation count: 1
Abstract
A gene regulatory network links pairs of genes through an edge if they physically or functionally interact. "Reverse engineering" a gene regulatory network means inferring the edges between genes from the available experimental data. Transcriptional responses (i.e., gene expression profiles obtained through microarray experiments) are often used to reverse-engineer a network of genes. Reverse engineering consists of analyzing transcriptional responses to a set of treatments and adding an edge between genes if their expression shows coordinated behavior on a subset of the treatments, according to some underlying model of gene regulation. Mammalian cells contain tens of thousands of genes, and hundreds of transcriptional responses must be analyzed to obtain acceptable statistical evidence of interactions between genes. Several ready-to-use software packages can infer gene networks, but few can infer large networks from thousands of transcriptional responses, because the size of the problem leads to high computational costs and memory requirements. We propose exploiting parallel computing techniques to overcome this problem. In this work, we designed and developed a parallel computing algorithm to reverse-engineer large-scale gene regulatory networks from tens of thousands of gene expression profiles. The algorithm is based on computing the pairwise mutual information for every gene pair. We successfully tested it by inferring the Mus musculus (mouse) gene regulatory network in liver from 312 expression profiles collected from a public Internet repository. Each profile measures the expression of 45,101 genes (more specifically, transcripts). We analyzed all possible gene pairs, about 1 billion in total, and identified about 60 million edges. We used a hierarchical clustering algorithm to discover communities within the gene network and found a modular structure that highlights genes involved in the same biological functions.
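
The pairwise mutual information step and its split across workers can be illustrated with a minimal Python sketch. The abstract does not state which estimator, threshold, or partitioning scheme the authors used, so everything here is an assumption: a histogram (plug-in) estimator, a fixed cutoff of 0.5 nats, and a round-robin assignment of rows to workers. The function names `mutual_information`, `pairs_for_worker`, and `infer_edges` are hypothetical, and the data are synthetic.

```python
import math

import numpy as np


def mutual_information(x, y, bins=10):
    """Histogram (plug-in) estimate of mutual information, in nats.

    Assumption: the abstract does not name an estimator; this is one common choice.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of y, shape (1, bins)
    nz = pxy > 0                         # empty cells contribute 0 to the sum
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def pairs_for_worker(n_genes, rank, n_workers):
    """Yield the gene pairs (i, j), i < j, owned by one worker.

    Assumption: rows are dealt round-robin, so every pair is visited
    exactly once across all workers.
    """
    for i in range(rank, n_genes, n_workers):
        for j in range(i + 1, n_genes):
            yield i, j


def infer_edges(expr, threshold, rank=0, n_workers=1, bins=10):
    """Return (i, j, mi) for this worker's pairs whose MI reaches the threshold.

    expr has shape (n_genes, n_profiles): one row per gene.
    """
    edges = []
    for i, j in pairs_for_worker(expr.shape[0], rank, n_workers):
        mi = mutual_information(expr[i], expr[j], bins)
        if mi >= threshold:
            edges.append((i, j, mi))
    return edges


# Synthetic demo: gene 1 tracks gene 0, genes 2 and 3 are independent noise.
rng = np.random.default_rng(0)
base = rng.normal(size=312)
expr = np.vstack([
    base,
    base + 0.1 * rng.normal(size=312),
    rng.normal(size=312),
    rng.normal(size=312),
])

# Two simulated workers run one after another; a real run would use separate processes.
edges = []
for rank in range(2):
    edges += infer_edges(expr, threshold=0.5, rank=rank, n_workers=2)
print(sorted((i, j) for i, j, _ in edges))

# Number of unordered pairs among 45,101 transcripts: 1017027550, i.e. about 1 billion.
print(math.comb(45101, 2))
```

Each pair is independent of every other pair, which is what makes the problem easy to parallelize: workers need only the expression matrix and their share of the pair list, and the per-worker edge lists can be concatenated at the end. The plug-in estimator is biased upward for small samples, so any threshold in practice would need calibration against a null distribution rather than the arbitrary cutoff used here.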