GraphR: Accelerating Graph Processing Using ReRAM

2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). Pub Date: 2017-08-21. DOI: 10.1109/HPCA.2018.00052

Linghao Song, Youwei Zhuo, Xuehai Qian, Hai Helen Li, Yiran Chen


Citations: 202

Abstract

Graph processing has recently received intensive interest in light of a wide range of needs to understand relationships. It is well known for its poor locality and high memory bandwidth requirement. On conventional architectures, graph workloads incur a significant amount of data movement and energy consumption, which has motivated several hardware graph processing accelerators. Current graph processing accelerators rely on memory access optimizations or on placing computation logic close to memory. Distinct from all existing approaches, we leverage an emerging memory technology to accelerate graph processing with analog computation. This paper presents GRAPHR, the first ReRAM-based graph processing accelerator. GRAPHR follows the principle of near-data processing and explores the opportunity of performing massively parallel analog operations with low hardware and energy cost. Analog computation is suitable for graph processing because: 1) the algorithms are iterative and can inherently tolerate imprecision; 2) both probability calculations (e.g., PageRank and Collaborative Filtering) and typical graph algorithms involving integers (e.g., BFS/SSSP) are resilient to errors. The key insight of GRAPHR is that if a vertex program of a graph algorithm can be expressed as sparse matrix-vector multiplication (SpMV), it can be performed efficiently by a ReRAM crossbar. We show that this assumption is generally true for a large set of graph algorithms. GRAPHR is a novel accelerator architecture consisting of two components: memory ReRAM and graph engines (GEs). The core graph computations are performed in sparse matrix format in the GEs (ReRAM crossbars). Vector/matrix-based graph computation is not new, but ReRAM offers a unique opportunity to realize massive parallelism with unprecedented energy efficiency and low hardware cost. With small subgraphs processed by GEs, the gain from performing parallel operations outweighs the waste due to sparsity. Experimental results show that GRAPHR achieves a 16.01× (up to 132.67×) speedup and a 33.82× energy saving on geometric mean compared to a CPU baseline system. Compared to a GPU, GRAPHR achieves a 1.69× to 2.19× speedup and consumes 4.77× to 8.91× less energy. Compared to a PIM-based architecture, GRAPHR gains a speedup of 1.16× to 4.12× and is 3.67× to 10.96× more energy-efficient.
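
The SpMV formulation mentioned in the abstract can be illustrated in software. The following is only a minimal sketch, not the GRAPHR design: it expresses PageRank as repeated matrix-vector products and evaluates each product one small dense block at a time, loosely mimicking how a small subgraph could be mapped onto a crossbar. The function names, the tile size, the damping factor, and the dictionary-based edge format are all assumptions made for illustration, and the arithmetic is exact floating point rather than analog.

```python
# Illustrative sketch only: names, tile size and damping factor are
# assumptions, not taken from the paper.

def tiled_spmv(edges, x, n, tile=4):
    """Compute y = A @ x by visiting A one dense tile-by-tile block at a time.

    edges maps (row, col) -> weight for the nonzeros of an n-by-n matrix A.
    """
    # Group nonzeros by the block they fall in; all-zero blocks are skipped.
    blocks = {}
    for (r, c), w in edges.items():
        blocks.setdefault((r // tile, c // tile), []).append((r, c, w))

    y = [0.0] * n
    for (br, bc), entries in blocks.items():
        # Materialise the block densely, zeros included. The zeros are the
        # "waste due to sparsity" that a dense block multiply pays for.
        dense = [[0.0] * tile for _ in range(tile)]
        for r, c, w in entries:
            dense[r % tile][c % tile] = w
        # Dense block times the matching slice of x, accumulated into y.
        for i in range(tile):
            row = br * tile + i
            if row >= n:
                break
            acc = 0.0
            for j in range(tile):
                col = bc * tile + j
                if col < n:
                    acc += dense[i][j] * x[col]
            y[row] += acc
    return y


def pagerank(out_edges, n, damping=0.85, iters=50, tile=4):
    """PageRank written as repeated SpMV.

    out_edges maps a source vertex to a list of distinct destination vertices.
    Every vertex is assumed to have at least one outgoing edge.
    """
    # Column-stochastic matrix: A[dst][src] = 1 / outdegree(src).
    edges = {}
    for src, dsts in out_edges.items():
        for dst in dsts:
            edges[(dst, src)] = 1.0 / len(dsts)

    rank = [1.0 / n] * n
    for _ in range(iters):
        contrib = tiled_spmv(edges, rank, n, tile)
        rank = [(1.0 - damping) / n + damping * c for c in contrib]
    return rank
```

For example, `pagerank({0: [1], 1: [2], 2: [0]}, 3)` runs 50 iterations on a three-vertex cycle. Only blocks that contain at least one edge are visited, which is one simple way to limit the cost of sparsity when each block is processed densely.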



