A random walk based approach for improving protein-protein interaction network and protein complex prediction

Chengwei Lei, Jianhua Ruan
Published in: 2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, pp. 1-6
Publication date: 2012-10-04
DOI: 10.1109/BIBM.2012.6392693 (https://doi.org/10.1109/BIBM.2012.6392693)
Citations: 4

Abstract

Recent advances in high-throughput technology have dramatically increased the quantity of available protein-protein interaction (PPI) data and stimulated the development of many methods for predicting protein complexes, which are important in understanding the functional organization of protein-protein interaction networks in different biological processes. However, automated protein complex prediction from PPI data alone is significantly hindered by the high level of noise, sparseness, and highly skewed degree distribution of PPI networks. Here we present a novel network topology-based algorithm to remove spurious interactions and recover missing ones by computational predictions, and to increase the accuracy of protein complex prediction by reducing the impact of hub nodes. The key idea of our algorithm is that two proteins sharing some high-order topological similarities, which are measured by a novel random walk-based procedure, are likely to interact with each other and may belong to the same protein complex. Applying our algorithm to a yeast protein-protein interaction network, we found that the interactions in the reconstructed PPI network have more significant biological relevance than those in the original network, as assessed by multiple types of information, including gene ontology, gene expression, essentiality, conservation between species, and known protein complexes. Comparison with several existing methods shows that the network reconstructed by our method has the highest quality. Finally, using two independent graph clustering algorithms, we found that the reconstructed network resulted in significantly improved prediction accuracy of protein complexes.
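The abstract does not spell out the random walk procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a generic random walk with restart to build a visiting-probability profile for each protein, and cosine similarity between profiles as a stand-in for "high-order topological similarity". The function names (`rwr_profile`, `cosine`, `reconstruct`), the restart probability, the threshold, and the toy network are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def rwr_profile(adj, start, restart=0.5, n_iter=100):
    """Visiting probabilities of a random walk with restart that begins at `start`."""
    p = {v: 0.0 for v in adj}
    p[start] = 1.0
    for _ in range(n_iter):
        nxt = {v: 0.0 for v in adj}
        for u, mass in p.items():
            if mass == 0.0:
                continue
            move = (1.0 - restart) * mass
            if adj[u]:
                # Spread the walking mass evenly over the neighbors of u.
                for v in adj[u]:
                    nxt[v] += move / len(adj[u])
            else:
                # Isolated node: the walker has nowhere to go and stays put.
                nxt[u] += move
        # With probability `restart`, the walker jumps back to the start node.
        nxt[start] += restart
        p = nxt
    return p


def cosine(p, q):
    """Cosine similarity between two profiles defined over the same node set."""
    dot = sum(p[v] * q[v] for v in p)
    norm = math.sqrt(sum(x * x for x in p.values())) * math.sqrt(
        sum(x * x for x in q.values())
    )
    return dot / norm if norm else 0.0


def reconstruct(adj, threshold=0.5, restart=0.5):
    """Return the node pairs whose walk profiles are at least `threshold` similar."""
    profiles = {v: rwr_profile(adj, v, restart) for v in adj}
    return {
        (u, v)
        for u, v in combinations(sorted(adj), 2)
        if cosine(profiles[u], profiles[v]) >= threshold
    }


# Hypothetical toy network: two triangles joined by the single edge c-d.
TOY = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}

if __name__ == "__main__":
    for pair in sorted(reconstruct(TOY)):
        print(pair)
```

In this sketch an edge is kept or added purely on profile similarity, so a pair inside one triangle scores higher than a pair spanning the two triangles. How the paper actually scores pairs, selects a cutoff, and dampens hub nodes is not stated in the abstract and is not modeled here.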