Multi-stage graph peeling algorithm for probabilistic core decomposition

Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. Publication date: 2021-08-13. DOI: 10.1145/3487351.3489470

Yang Guo, Xuekui Zhang, F. Esfahani, Venkatesh Srinivasan, Alex Thomo, Li Xing


Citations: 2

Abstract

Mining dense subgraphs, in which vertices are closely connected to each other, is a common task in graph analysis. A very popular notion in subgraph analysis is core decomposition. Recently, Esfahani et al. presented a probabilistic core decomposition algorithm based on graph peeling and the Central Limit Theorem (CLT) that is capable of handling very large graphs. Their peeling algorithm (PA) starts from the lowest-degree vertices and recursively deletes them, assigning core numbers and updating the degrees of neighbouring vertices, until it reaches the maximum core. However, in many applications, particularly in biology, more valuable information can be obtained from dense sub-communities, and we are not interested in small cores whose vertices do not interact much with others. To make the previous PA focus more on dense subgraphs, we propose a multi-stage graph peeling algorithm (M-PA) that adds a two-stage data screening procedure before the previous PA. By removing vertices from the graph based on user-defined thresholds, we can greatly reduce the graph complexity without affecting the vertices in the subgraphs that we are interested in. We show that M-PA is more efficient than the previous PA and that, with a properly set filtering threshold, it produces dense subgraphs that are very similar, if not identical, to those of the previous PA (in terms of graph density and clustering coefficient).
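The peeling idea described above can be illustrated with a minimal Python sketch on an ordinary (deterministic) graph. This is not the authors' PA or M-PA: the probabilistic version relies on CLT-based degree estimates that the abstract does not spell out, and the two-stage screening rules are not given here either. The function names `build_adj`, `core_numbers`, and `screen_low_degree`, and the single degree threshold used for screening, are assumptions made purely for illustration.

```python
from collections import defaultdict


def build_adj(edges):
    """Build an undirected adjacency map {vertex: set(neighbours)}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def core_numbers(adj):
    """Deterministic peeling: repeatedly delete a minimum-degree vertex,
    assign its core number, and update the degrees of its neighbours."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Linear scan for the minimum; fine for a sketch, not for large graphs.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core numbers never decrease during peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def screen_low_degree(adj, threshold):
    """Hypothetical screening step (an assumption, not the paper's rule):
    iteratively drop vertices whose degree falls below `threshold`."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    queue = [v for v, nbrs in adj.items() if len(nbrs) < threshold]
    while queue:
        v = queue.pop()
        if v not in adj:  # already removed via a duplicate queue entry
            continue
        for u in adj.pop(v):
            adj[u].discard(v)
            if len(adj[u]) < threshold:
                queue.append(u)
    return adj


# Toy graph: a 4-clique {a, b, c, d} with a sparse tail d - e - f.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
         ("b", "d"), ("c", "d"), ("d", "e"), ("e", "f")]
graph = build_adj(edges)
full = core_numbers(graph)
screened = core_numbers(screen_low_degree(graph, threshold=3))
```

On this toy graph the screening step discards the sparse tail before peeling, and the clique vertices receive the same core number with or without screening. In the deterministic setting this holds in general for iterative degree-threshold removal; whether and how it carries over to the probabilistic thresholds of M-PA is a question for the paper itself.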

