Parallel Computing Algorithms for Reverse-Engineering and Analysis of Genome-Wide Gene Regulatory Networks from Gene Expression Profiles

Infinity. Pub Date: 2010-09-30. DOI: 10.1109/PDMC-HIBI.2010.20
V. Belcastro, D. Bernardo, F. Gregoretti, G. Oliva
Citations: 1

Abstract

A gene regulatory network links pairs of genes through an edge if they physically or functionally interact. "Reverse engineering" a gene regulatory network means inferring the edges between genes from the available experimental data. Transcriptional responses (i.e., gene expression profiles obtained through microarray experiments) are often used to reverse-engineer a network of genes. Reverse engineering consists of analyzing transcriptional responses to a set of treatments and adding an edge between two genes if their expression shows coordinated behavior on a subset of the treatments, according to some underlying model of gene regulation. Mammalian cells contain tens of thousands of genes, and hundreds of transcriptional responses must be analyzed to obtain acceptable statistical evidence of interactions between genes. Several ready-to-use software packages can infer gene networks, but few can infer large networks from thousands of transcriptional responses, because the size of the problem leads to high computational costs and memory requirements. We propose to exploit parallel computing techniques to overcome this problem. In this work, we designed and developed a parallel computing algorithm to reverse-engineer large-scale gene regulatory networks from tens of thousands of gene expression profiles. The algorithm computes the pairwise Mutual Information for every gene pair. We successfully tested it by inferring the Mus musculus (mouse) gene regulatory network in liver from 312 expression profiles collected from a public Internet repository. Each profile measures the expression of 45,101 genes (more specifically, transcripts). We analyzed all possible gene pairs, about 1 billion in total, and identified about 60 million edges. We used a hierarchical clustering algorithm to discover communities within the gene network, and found a modular structure that highlights genes involved in the same biological functions.
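
The pairwise step described above may be easier to picture with a small sketch. With 45,101 transcripts there are 45,101 × 45,100 / 2 = 1,017,027,550 unordered pairs, which matches the "about 1 billion" figure, and each pair can be scored independently of the others. The Python below is only an illustration under stated assumptions, not the authors' implementation: the equal-width binning, the plug-in Mutual Information estimator, the fixed `threshold`, and the names `discretize`, `mutual_information`, and `infer_edges` are all hypothetical choices that the abstract does not specify. A thread pool stands in for the parallel workers; a real large-scale run would more plausibly distribute pairs across processes or cluster nodes.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def discretize(values, bins=4):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant profile: everything falls in one bin
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Plug-in estimate of MI (in bits) between two discretized profiles."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # sum over observed cells of p(a,b) * log2( p(a,b) / (p(a) * p(b)) )
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def infer_edges(expr, threshold, bins=4, workers=4):
    """Score every gene pair and keep those with MI >= threshold.

    expr maps a gene name to its list of expression values,
    one value per profile (hypothetical input layout).
    """
    disc = {gene: discretize(vals, bins) for gene, vals in expr.items()}
    pairs = list(combinations(sorted(disc), 2))

    def score(pair):
        a, b = pair
        return a, b, mutual_information(disc[a], disc[b])

    # Pairs are independent, so they can be handed to workers in any order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(score, pairs))
    return [(a, b, mi) for a, b, mi in scored if mi >= threshold]


if __name__ == "__main__":
    # Toy data: g2 tracks g1 exactly, g3 is constant.
    toy = {
        "g1": [1, 2, 3, 4, 5, 6, 7, 8],
        "g2": [2, 4, 6, 8, 10, 12, 14, 16],
        "g3": [5, 5, 5, 5, 5, 5, 5, 5],
    }
    print(infer_edges(toy, threshold=1.0))
    print(math.comb(45101, 2))  # number of unordered gene pairs
```

Materializing the full pair list, as done here for brevity, would not scale to a billion pairs; at that size the pairs would need to be generated in blocks and assigned to workers incrementally.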