Multi-stage graph peeling algorithm for probabilistic core decomposition

Yang Guo, Xuekui Zhang, F. Esfahani, Venkatesh Srinivasan, Alex Thomo, Li Xing
Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Published: 2021-08-13
DOI: 10.1145/3487351.3489470 (https://doi.org/10.1145/3487351.3489470)
Citations: 2

Abstract

Mining dense subgraphs, in which vertices are closely connected to each other, is a common task in graph analysis. A very popular notion in subgraph analysis is core decomposition. Recently, Esfahani et al. presented a probabilistic core decomposition algorithm, based on graph peeling and the Central Limit Theorem (CLT), that is capable of handling very large graphs. Their peeling algorithm (PA) starts from the lowest-degree vertices and recursively deletes them, assigning core numbers and updating the degrees of neighbouring vertices until it reaches the maximum core. However, in many applications, particularly in biology, more valuable information can be obtained from dense sub-communities, and small cores whose vertices interact little with others are of no interest. To make the previous PA focus more on dense subgraphs, we propose a multi-stage graph peeling algorithm (M-PA) that adds a two-stage data screening procedure before the previous PA. Removing vertices from the graph based on user-defined thresholds substantially reduces graph complexity without affecting the vertices in the subgraphs of interest. We show that M-PA is more efficient than the previous PA and that, with a properly set filtering threshold, it produces dense subgraphs that are very similar, if not identical, to those of the previous PA (in terms of graph density and clustering coefficient).
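
The peel-then-screen idea can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: the graph is deterministic (no edge probabilities and no CLT-based degree estimation), a single degree filter stands in for the two-stage screening procedure, and the names `core_numbers`, `screen_by_degree`, and `example` are hypothetical.

```python
def core_numbers(adj):
    """Peel vertices in order of current degree and return {vertex: core number}.

    `adj` maps each vertex to the set of its neighbours (undirected graph).
    This is the classic deterministic peeling scheme, not the probabilistic PA.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Pick a lowest-degree vertex (linear scan; a bucket queue would be faster).
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core numbers never decrease during peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1  # update the degrees of surviving neighbours
    return core


def screen_by_degree(adj, threshold):
    """Hypothetical one-pass screening: drop vertices with degree < threshold."""
    keep = {v for v, nbrs in adj.items() if len(nbrs) >= threshold}
    return {v: adj[v] & keep for v in keep}


# Example: a 4-clique {a, b, c, d} with a sparse tail d - e - f.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}

full = core_numbers(example)
screened = core_numbers(screen_by_degree(example, threshold=3))
```

In this toy graph the clique vertices have core number 3 both with and without screening, while the tail vertices `e` and `f` are discarded before peeling starts. In the deterministic setting this holds in general for a degree filter: every vertex of a k-core with k at least the threshold has degree at least the threshold, so it survives the filter and keeps its core number.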