Communication-avoiding parallel minimum cuts and connected components

Lukas Gianinazzi, Pavel Kalvoda, A. Palma, Maciej Besta, T. Hoefler
{"title":"Communication-avoiding parallel minimum cuts and connected components","authors":"Lukas Gianinazzi, Pavel Kalvoda, A. Palma, Maciej Besta, T. Hoefler","doi":"10.1145/3178487.3178504","DOIUrl":null,"url":null,"abstract":"We present novel scalable parallel algorithms for finding global minimum cuts and connected components, which are important and fundamental problems in graph processing. To take advantage of future massively parallel architectures, our algorithms are communication-avoiding: they reduce the costs of communication across the network and the cache hierarchy. The fundamental technique underlying our work is the randomized sparsification of a graph: removing a fraction of graph edges, deriving a solution for such a sparsified graph, and using the result to obtain a solution for the original input. We design and implement sparsification with O(1) synchronization steps. Our global minimum cut algorithm decreases communication costs and computation compared to the state-of-the-art, while our connected components algorithm incurs few cache misses and synchronization steps. We validate our approach by evaluating MPI implementations of the algorithms on a petascale supercomputer. We also provide an approximate variant of the minimum cut algorithm and show that it approximates the exact solutions well while using a fraction of cores in a fraction of time.","PeriodicalId":193776,"journal":{"name":"Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming","volume":"2 4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3178487.3178504","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 33

Abstract

We present novel scalable parallel algorithms for finding global minimum cuts and connected components, which are important and fundamental problems in graph processing. To take advantage of future massively parallel architectures, our algorithms are communication-avoiding: they reduce the costs of communication across the network and the cache hierarchy. The fundamental technique underlying our work is the randomized sparsification of a graph: removing a fraction of graph edges, deriving a solution for such a sparsified graph, and using the result to obtain a solution for the original input. We design and implement sparsification with O(1) synchronization steps. Our global minimum cut algorithm decreases communication costs and computation compared to the state-of-the-art, while our connected components algorithm incurs few cache misses and synchronization steps. We validate our approach by evaluating MPI implementations of the algorithms on a petascale supercomputer. We also provide an approximate variant of the minimum cut algorithm and show that it approximates the exact solutions well while using a fraction of cores in a fraction of time.
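To make the sparsification idea concrete, below is a minimal sequential sketch of how randomized edge sampling can be combined with connected-components labeling: keep each edge with some probability, solve on the sparsified graph, and then only the skipped edges that still cross two different components need further processing. This is an illustrative assumption-based sketch, not the paper's communication-avoiding MPI algorithm; the names `sparsified_components` and `sample_prob` are hypothetical and the implementation is a plain union-find rather than the authors' parallel data structures.

```cpp
// Sketch: randomized edge sparsification for connected components.
// Phase 1 solves the problem on a random subset of the edges; phase 2 uses
// that partial labeling so only component-crossing skipped edges matter.
#include <cstdint>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

struct UnionFind {
    std::vector<int> parent;
    explicit UnionFind(int n) : parent(n) {
        std::iota(parent.begin(), parent.end(), 0);
    }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    void unite(int a, int b) { parent[find(a)] = find(b); }
};

// Returns a component label for each of the n vertices.
std::vector<int> sparsified_components(int n,
                                       const std::vector<std::pair<int, int>>& edges,
                                       double sample_prob, std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::bernoulli_distribution keep(sample_prob);

    UnionFind uf(n);
    std::vector<std::pair<int, int>> skipped;

    // Phase 1: solve on the sparsified graph (each edge kept with prob. sample_prob).
    for (const auto& e : edges) {
        if (keep(rng)) uf.unite(e.first, e.second);
        else skipped.push_back(e);
    }

    // Phase 2: patch up with the edges that were sampled out; only those that
    // still connect two different components change the labeling.
    for (const auto& e : skipped) {
        if (uf.find(e.first) != uf.find(e.second)) uf.unite(e.first, e.second);
    }

    std::vector<int> label(n);
    for (int v = 0; v < n; ++v) label[v] = uf.find(v);
    return label;
}
```

In a distributed setting the appeal of this structure is that the expensive, communication-heavy work is done on the smaller sampled graph, while the second phase touches far fewer edges; the paper's algorithms realize this with O(1) synchronization steps, which the sketch above does not attempt to model.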