Improvements of graph entropy approach to detect protein complexes by ontological analysis of PPIs

2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, pp. 211-217; Pub Date: 2012-10-04; DOI: 10.1109/BIBMW.2012.6470306

Francisco I. Pena, Young-Rae Cho


Citations: 0


Improvements of graph entropy approach to detect protein complexes by ontological analysis of PPIs

The generation of protein-protein interaction (PPI) data has created the need for efficient computational approaches that can discover highly modular, high-quality clusters. These clusters represent protein complexes or functional modules. A number of seed-growth style algorithms exist to identify protein complexes from genome-wide PPI networks. However, these methods lose accuracy when the networks are comparatively large and have complex connectivity. To combat the noise that exists in these large PPI networks, we propose an improvement to the graph entropy approach, which is one of the seed-growth style algorithms. As a novel information-theoretic definition, graph entropy is a measure of the structural complexity of a graph. For example, a loss of entropy represents an increase in the modularity of the graph. The original algorithm considers only the connectivity of vertices, whereas the modified definition also considers edge weights. These edge weights are obtained by measuring the semantic similarity of PPIs. The weighted graph entropy approach is applied to the S. cerevisiae PPI data set from BioGRID. The output clusters are compared with known protein complexes so that we can calculate f-scores and use them to evaluate the clusters' accuracy. The proposed improvement to the graph entropy approach is shown to enhance the quality of clusters as potential protein complexes when compared to the other seed-growth style algorithms.
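The abstract does not state the formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes that the entropy of a vertex is the binary entropy of the share of its edge weight that falls inside the candidate cluster, that graph entropy is the sum of vertex entropies over the whole network, and that a cluster is scored against a known complex by the harmonic mean of precision and recall. The names `vertex_entropy`, `graph_entropy`, and `f_score`, as well as the toy network and its weights, are hypothetical.

```python
import math
from typing import Dict, FrozenSet, Set

# Weighted undirected graph: an unordered protein pair mapped to its edge weight.
# In the paper's setting the weight would be a semantic similarity score (assumption).
WeightedEdges = Dict[FrozenSet[str], float]


def vertex_entropy(v: str, cluster: Set[str], edges: WeightedEdges) -> float:
    """Binary entropy of the share of v's edge weight that stays inside the cluster."""
    inner = total = 0.0
    for pair, weight in edges.items():
        if v in pair and len(pair) == 2:  # skip self-loops
            (neighbor,) = pair - {v}
            total += weight
            if neighbor in cluster:
                inner += weight
    if total == 0.0:
        return 0.0
    p_in = inner / total
    p_out = 1.0 - p_in
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(p * math.log2(p) for p in (p_in, p_out) if p > 0.0)


def graph_entropy(cluster: Set[str], edges: WeightedEdges) -> float:
    """Sum of vertex entropies over every vertex that appears in the network."""
    vertices = set().union(*edges)
    return sum(vertex_entropy(v, cluster, edges) for v in vertices)


def f_score(predicted: Set[str], known: Set[str]) -> float:
    """Harmonic mean of precision and recall between a cluster and a known complex."""
    overlap = len(predicted & known)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(known)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy network: a triangle A-B-C with a weakly supported link C-D and a tail D-E.
    toy: WeightedEdges = {
        frozenset({"A", "B"}): 1.0,
        frozenset({"B", "C"}): 1.0,
        frozenset({"A", "C"}): 1.0,
        frozenset({"C", "D"}): 0.2,  # low similarity, treated as likely noise
        frozenset({"D", "E"}): 1.0,
    }
    cluster = {"A", "B", "C"}
    print("graph entropy:", graph_entropy(cluster, toy))
    print("f-score vs. {A, B, C, D}:", f_score(cluster, {"A", "B", "C", "D"}))
```

In this sketch, setting every weight to 1.0 reduces the measure to a purely connectivity-based one, which is one way to read the contrast the abstract draws between the original and the weighted definitions. A seed-growth procedure would then add or remove vertices at the cluster boundary whenever doing so lowers the graph entropy; that loop is omitted here.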

