PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs

ACM Transactions on Parallel Computing (IF 0.9, Q3, Computer Science, Theory & Methods). Pub Date: 2019-01-23. DOI: 10.1145/3298989
Rong Chen, Jiaxin Shi, Yanzhe Chen, B. Zang, Haibing Guan, Haibo Chen
Volume: 5 1. Pages: 13:1-13:39. Publication type: Journal Article.
Citations: 323

Abstract

Natural graphs with skewed distributions raise unique challenges to distributed graph computation and partitioning. Existing graph-parallel systems usually use a “one-size-fits-all” design that uniformly processes all vertices, which either suffer from notable load imbalance and high contention for high-degree vertices (e.g., Pregel and GraphLab) or incur high communication cost and memory consumption even for low-degree vertices (e.g., PowerGraph and GraphX). In this article, we argue that skewed distributions in natural graphs also necessitate differentiated processing on high-degree and low-degree vertices. We then introduce PowerLyra, a new distributed graph processing system that embraces the best of both worlds of existing graph-parallel systems. Specifically, PowerLyra uses centralized computation for low-degree vertices to avoid frequent communications and distributes the computation for high-degree vertices to balance workloads. PowerLyra further provides an efficient hybrid graph partitioning algorithm (i.e., hybrid-cut) that combines edge-cut (for low-degree vertices) and vertex-cut (for high-degree vertices) with heuristics. To improve cache locality of inter-node graph accesses, PowerLyra further provides a locality-conscious data layout optimization. PowerLyra is implemented based on the latest GraphLab and can seamlessly support various graph algorithms running in both synchronous and asynchronous execution modes. A detailed evaluation on three clusters using various graph-analytics and MLDM (Machine Learning and Data Mining) applications shows that PowerLyra outperforms PowerGraph by up to 5.53X (from 1.24X) and 3.26X (from 1.49X) for real-world and synthetic graphs, respectively, and is much faster than other systems like GraphX and Giraph, yet with much less memory consumption. A porting of hybrid-cut to GraphX further confirms the efficiency and generality of PowerLyra.
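The hybrid-cut idea described in the abstract (edge-cut for low-degree vertices, vertex-cut for high-degree vertices) can be illustrated with a minimal sketch. This is not PowerLyra's implementation: the function names `hybrid_cut` and `replication_factor`, the `threshold` parameter, the use of in-degree as the skew measure, and the modulo hashing are all assumptions made for illustration. Edges pointing at a low-degree vertex are placed together in the partition chosen by the target vertex, while edges pointing at a high-degree vertex are spread out by source vertex.

```python
from collections import Counter, defaultdict


def hybrid_cut(edges, num_parts, threshold):
    """Assign directed edges (src, dst) to partitions (illustrative sketch).

    Assumption: a vertex is "high-degree" when its in-degree exceeds
    `threshold` (a hypothetical parameter, not taken from the abstract).
    """
    in_degree = Counter(dst for _, dst in edges)
    placement = defaultdict(list)
    for src, dst in edges:
        if in_degree[dst] <= threshold:
            # Low-degree target: keep all of its in-edges together
            # (edge-cut style), so its computation can stay on one node.
            part = dst % num_parts
        else:
            # High-degree target: spread its in-edges by source vertex
            # (vertex-cut style), so the work is shared across nodes.
            part = src % num_parts
        placement[part].append((src, dst))
    return dict(placement)


def replication_factor(placement):
    """Average number of partitions in which each vertex appears."""
    replicas = defaultdict(set)
    for part, part_edges in placement.items():
        for src, dst in part_edges:
            replicas[src].add(part)
            replicas[dst].add(part)
    if not replicas:
        return 0.0
    return sum(len(parts) for parts in replicas.values()) / len(replicas)


if __name__ == "__main__":
    # Vertex 0 is a hub with eight in-edges; vertices 9 and 10 are low-degree.
    edges = [(i, 0) for i in range(1, 9)] + [(1, 9), (2, 9), (3, 10)]
    placement = hybrid_cut(edges, num_parts=4, threshold=3)
    for part in sorted(placement):
        print(part, placement[part])
    print("replication factor:", replication_factor(placement))
```

In this toy graph, the in-edges of the hub vertex 0 end up on all four partitions, while both in-edges of vertex 9 land on a single partition. The replication factor is a rough proxy for the communication and memory cost that the abstract attributes to uniform vertex-cut designs.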
Source journal: ACM Transactions on Parallel Computing (Computer Science, Theory & Methods)
CiteScore: 4.10
Self-citation rate: 0.00%
Articles published: 16