Towards Fast Overlapping Community Detection

2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) Pub Date : 2016-05-16 DOI:10.1109/CCGrid.2016.98

I. El-Helw, Rutger F. H. Hofman, H. Bal


Citations: 3

Abstract

Accelerating sequential algorithms to achieve high performance is often a nontrivial task, and certain properties of an algorithm can make it particularly daunting. For example, building an efficient parallel solution for a data-intensive algorithm requires a deep analysis of its memory access patterns and data reuse potential. Attempting to scale out the computations on clusters of machines introduces further complications due to network speed limitations. In this context, the optimization landscape can be extremely complex owing to the large number of trade-off decisions. In this paper, we discuss our experience designing two parallel implementations of an existing data-intensive machine learning algorithm that detects overlapping communities in graphs. The first design uses a single GPU to accelerate the computations for small data sets. We employed a code generation strategy to test and identify the best-performing combination of optimizations. The second design uses a cluster of machines to scale out the computations for larger problem sizes. We used a mixture of MPI, RDMA and pipelining to circumvent networking overhead. Both of these efforts bring us closer to understanding the complex relationships hidden within networks of entities.


Source journal

2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)

Self-citation rate

0.00%

Number of publications