A random walk based approach for improving protein-protein interaction network and protein complex prediction

2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops. Pub Date: 2012-10-04. DOI: 10.1109/BIBM.2012.6392693

Chengwei Lei, Jianhua Ruan


Citations: 4

Abstract

Recent advances in high-throughput technology have dramatically increased the quantity of available protein-protein interaction (PPI) data and stimulated the development of many methods for predicting protein complexes, which are important for understanding the functional organization of PPI networks in different biological processes. However, automated protein complex prediction from PPI data alone is significantly hindered by the high level of noise, the sparseness, and the highly skewed degree distribution of PPI networks. Here we present a novel network topology-based algorithm that removes spurious interactions, recovers missing ones by computational prediction, and increases the accuracy of protein complex prediction by reducing the impact of hub nodes. The key idea of our algorithm is that two proteins sharing high-order topological similarities, as measured by a novel random walk-based procedure, are likely to interact with each other and may belong to the same protein complex. Applying our algorithm to a yeast PPI network, we found that the interactions in the reconstructed network have more significant biological relevance than those in the original network, as assessed by multiple types of information, including gene ontology, gene expression, essentiality, conservation between species, and known protein complexes. Comparison with several existing methods shows that the network reconstructed by our method has the highest quality. Finally, using two independent graph clustering algorithms, we found that the reconstructed network yields significantly improved prediction accuracy for protein complexes.
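The abstract does not spell out the random walk procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a standard random walk with restart (RWR) and scores each protein pair by the cosine similarity of their RWR profiles. The function names, the restart probability of 0.5, and the 8-node toy network are all hypothetical choices made for illustration.

```python
import numpy as np


def rwr_matrix(adj, restart=0.5):
    """Random walk with restart from every node (illustrative, assumed variant).

    Column j of the result is the steady-state visiting distribution of a
    walker that starts at node j and jumps back to j with probability
    `restart` at every step.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=0)
    if np.any(deg == 0):
        raise ValueError("isolated nodes are not handled in this sketch")
    # Column-stochastic transition matrix: P[i, j] = Pr(step from j to i).
    P = adj / deg
    n = adj.shape[0]
    # Closed form of r = (1 - c) * P @ r + c * e, solved for all start nodes.
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * P)


def profile_similarity(R):
    """Cosine similarity between the RWR profiles (columns) of all node pairs."""
    unit = R / np.linalg.norm(R, axis=0)
    return unit.T @ unit


# Hypothetical toy network: two dense groups joined by a single bridge edge.
edges = [
    (0, 1), (0, 2), (1, 2), (1, 3), (2, 3),          # group A (nodes 0-3)
    (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),  # group B (nodes 4-7)
    (3, 4),                                          # bridge between groups
]
n = 8
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

R = rwr_matrix(A)
S = profile_similarity(R)

# Pairs inside one group share far more of their walk profile than pairs
# taken from opposite ends of the network.
print("similarity(1, 2) =", round(S[1, 2], 3))
print("similarity(0, 7) =", round(S[0, 7], 3))
```

In a reconstruction step along the lines the abstract describes, one could rank all pairs by such a score and keep only the highest-ranked ones as edges. How the paper actually thresholds scores or dampens hub nodes is not stated here, so that part is left out.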


