PCGCN: Partition-Centric Processing for Accelerating Graph Convolutional Network

2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Pub Date: 2020-05-01, DOI: 10.1109/IPDPS47924.2020.00100

Chao Tian, Lingxiao Ma, Zhi Yang, Yafei Dai

Pages: 936-945

Citations: 30

Abstract

Inspired by the success of convolutional neural networks (CNNs) in computer vision, the convolution operation has been extended beyond low-dimensional grids (e.g., images) to high-dimensional graph-structured data (e.g., web graphs, social networks), leading to the graph convolutional network (GCN). GCNs have been gaining popularity due to their success in real-world applications such as recommendation and natural language processing. Because neural networks and graph propagation both have high computational complexity, GPUs have been adopted for both neural network training and graph processing. However, it is notoriously difficult to perform efficient GCN computation on data-parallel hardware such as GPUs because of the sparsity and irregularity of graphs. In this paper, we present PCGCN, a novel and general method that accelerates GCN computation by taking advantage of locality in graphs. We experimentally demonstrate that real-world graphs usually have a clustering property that can be used to enhance data locality in GCN computation. PCGCN then partitions the whole graph into chunks according to locality and processes the resulting subgraphs with a dual-mode computing strategy, which uses a selective processing method for sparse subgraphs and a full processing method for dense subgraphs. Compared with existing state-of-the-art GCN implementations on real-world and synthetic datasets, our implementation on top of TensorFlow achieves up to 8.8× speedup over the fastest baseline.
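The dual-mode idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' TensorFlow implementation: the function name `partition_centric_aggregate`, the `density_threshold` parameter and its value, and the use of a precomputed partition vector `part` are all assumptions made for illustration. The sketch computes the neighbor aggregation `adj @ feats` block by block, using a dense matrix product for dense blocks ("full" mode) and an edge-by-edge accumulation for sparse blocks ("selective" mode).

```python
import numpy as np


def partition_centric_aggregate(adj, feats, part, density_threshold=0.3):
    """Compute adj @ feats one (row-partition, column-partition) block at a time.

    adj:   (n, n) dense adjacency matrix (illustrative; real graphs would be sparse)
    feats: (n, d) node feature matrix
    part:  (n,) integer partition id per node (assumed to be given)
    density_threshold: hypothetical cutoff between the two modes
    """
    n_parts = int(part.max()) + 1
    groups = [np.flatnonzero(part == p) for p in range(n_parts)]
    out = np.zeros((adj.shape[0], feats.shape[1]), dtype=float)
    modes = {}  # records which mode each non-empty block used

    for i, rows in enumerate(groups):
        for j, cols in enumerate(groups):
            block = adj[np.ix_(rows, cols)]
            nnz = np.count_nonzero(block)
            if nnz == 0:
                continue  # no edges between these partitions
            if nnz / block.size >= density_threshold:
                # Full mode: treat the block as dense and use one matmul.
                out[rows] += block @ feats[cols]
                modes[(i, j)] = "full"
            else:
                # Selective mode: touch only the edges that exist.
                r, c = np.nonzero(block)
                np.add.at(out, rows[r], block[r, c][:, None] * feats[cols[c]])
                modes[(i, j)] = "selective"
    return out, modes


# Two 3-node clusters joined by a single edge (2, 3).
adj = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[u, v] = adj[v, u] = 1.0
feats = np.arange(12, dtype=float).reshape(6, 2)
part = np.array([0, 0, 0, 1, 1, 1])

out, modes = partition_centric_aggregate(adj, feats, part)
print(modes)
```

In this toy graph the two intra-cluster blocks are dense and take the full path, while the cross-cluster blocks hold a single edge each and take the selective path. The result matches a plain `adj @ feats`, so the two modes differ only in how the work is done, not in what is computed.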
