Complex Network Analysis Using Parallel Approximate Motif Counting

2014 IEEE 28th International Parallel and Distributed Processing Symposium. Pub Date: 2014-05-19. DOI: 10.1109/IPDPS.2014.50

George M. Slota, Kamesh Madduri


Citations: 32

Abstract

Subgraph counting forms the basis of many complex network analysis metrics, including motif and anti-motif finding, relative graphlet frequency distance, and graphlet degree distribution agreements. Determining exact subgraph counts is computationally very expensive. In recent work, we presented FASCIA, a shared-memory parallel algorithm and implementation for approximate subgraph counting. FASCIA uses a dynamic programming-based approach and is significantly faster than exhaustive enumeration, while generating high-quality approximations of subgraph counts. However, the memory usage of the dynamic programming step prohibits us from applying FASCIA to very large graphs. In this paper, we introduce a distributed-memory parallelization of FASCIA by partitioning the graph and the dynamic programming table. We discuss a new collective communication scheme to make the dynamic programming step memory-efficient. These optimizations enable scaling to much larger networks than before. We also present a simple parallelization strategy for distributed subgraph counting on smaller networks. The new additions let us use subgraph counts as graph signatures for a large network collection, and we analyze this collection using various subgraph count-based graph analytics.
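
To make the "dynamic programming-based approach" to approximate counting concrete, the following is a minimal serial sketch of color-coding, a standard randomized technique for this problem. The abstract does not spell out the algorithm, so this is only an illustration and not FASCIA's implementation: it counts simple paths on k vertices rather than general templates, and the function names, the adjacency-dict graph format, and the bitmask table layout are all assumptions made for this example. The per-vertex table indexed by color set is the kind of structure whose size grows quickly with graph and template size.

```python
import random
from math import factorial


def count_colorful_paths(adj, k, colors):
    """Count undirected simple paths on k >= 2 vertices whose colors are all distinct.

    adj:    dict mapping each vertex to an iterable of neighbors (undirected graph)
    colors: dict mapping each vertex to a color in range(k)
    """
    # table[v][mask] = number of colorful paths ending at v whose color set is `mask`
    table = {v: {1 << colors[v]: 1} for v in adj}
    for _ in range(k - 1):
        nxt = {v: {} for v in adj}
        for v in adj:
            bit = 1 << colors[v]
            for u in adj[v]:
                for mask, cnt in table[u].items():
                    if not mask & bit:  # extend only if v's color is unused
                        new_mask = mask | bit
                        nxt[v][new_mask] = nxt[v].get(new_mask, 0) + cnt
        table = nxt
    total = sum(cnt for row in table.values() for cnt in row.values())
    return total // 2  # each undirected path is found once from each end


def estimate_path_count(adj, k, iterations=200, seed=0):
    """Estimate the number of simple k-vertex paths by averaging over random colorings."""
    rng = random.Random(seed)
    # A fixed k-vertex path is colorful with probability k! / k^k,
    # so the colorful count is scaled by the inverse of that probability.
    scale = k ** k / factorial(k)
    total = 0
    for _ in range(iterations):
        colors = {v: rng.randrange(k) for v in adj}
        total += count_colorful_paths(adj, k, colors)
    return scale * total / iterations


if __name__ == "__main__":
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    print(estimate_path_count(triangle, 3))
```

Because distinct colors imply distinct vertices, the dynamic program never has to track which vertices a partial path has visited, only which colors; that is what makes it cheaper than exhaustive enumeration, at the cost of a table entry per vertex and color subset.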


