GraphReduce: Processing Large-Scale Graphs on Accelerator-Based Systems

SC15: International Conference for High Performance Computing, Networking, Storage and Analysis. Pub Date: 2015-11-15. DOI: 10.1145/2807591.2807655

D. Sengupta, S. Song, K. Agarwal, K. Schwan


Citations: 89

Abstract

Recent work on real-world graph analytics has sought to leverage the massive amount of parallelism offered by GPU devices, but challenges remain due to the inherent irregularity of graph algorithms and limitations in GPU-resident memory for storing large graphs. We present GraphReduce, a highly efficient and scalable GPU-based framework that operates on graphs that exceed the device's internal memory capacity. GraphReduce adopts a combination of edge- and vertex-centric implementations of the Gather-Apply-Scatter programming model and operates on multiple asynchronous GPU streams to fully exploit the high degrees of parallelism in GPUs with efficient graph data movement between the host and device. GraphReduce-based programming is performed via device functions that include gatherMap, gatherReduce, apply, and scatter, implemented by programmers for the graph algorithms they wish to realize. Extensive experimental evaluations for a wide variety of graph inputs and algorithms demonstrate that GraphReduce significantly outperforms other competing out-of-memory approaches.
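To make the programming model more concrete, here is a minimal CPU-only sketch in Python of a Gather-Apply-Scatter loop for single-source shortest paths. It is not GraphReduce's actual API. The abstract names the four user functions (gatherMap, gatherReduce, apply, scatter) but not their signatures, so the signatures, the `Edge` type, the `run_gas` driver, and the fixed-size edge shards are all assumptions made for illustration. The shards stand in for chunks of a graph that would be moved between host and device. The sketch processes them sequentially, whereas the abstract describes multiple asynchronous GPU streams.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: int
    dst: int
    weight: float


# Hypothetical user-defined functions (signatures are assumptions).
def gather_map(src_value, edge):
    """Per-edge contribution: distance to dst when going through src."""
    return src_value + edge.weight


def gather_reduce(a, b):
    """Combine two contributions that target the same vertex."""
    return min(a, b)


def apply(old_value, gathered):
    """Return the new vertex value and whether it improved."""
    new_value = min(old_value, gathered)
    return new_value, new_value < old_value


def scatter(out_edges):
    """Activate the out-neighbours of a vertex whose value changed."""
    return {e.dst for e in out_edges}


def make_shards(edges, shard_size):
    """Split the edge list into fixed-size shards (an assumed stand-in
    for partitions small enough to fit in device memory)."""
    return [edges[i:i + shard_size] for i in range(0, len(edges), shard_size)]


def run_gas(num_vertices, edges, source, shard_size=2):
    values = [math.inf] * num_vertices
    values[source] = 0.0
    out_edges = {v: [] for v in range(num_vertices)}
    for e in edges:
        out_edges[e.src].append(e)
    shards = make_shards(edges, shard_size)

    active = set(range(num_vertices))  # every vertex gathers once at the start
    while active:
        # Gather: edge-centric pass over one shard at a time.
        gathered = {}
        for shard in shards:
            for e in shard:
                if e.dst in active:
                    msg = gather_map(values[e.src], e)
                    if e.dst in gathered:
                        gathered[e.dst] = gather_reduce(gathered[e.dst], msg)
                    else:
                        gathered[e.dst] = msg
        # Apply: vertex-centric update of every vertex that received data.
        changed = set()
        for v, g in gathered.items():
            values[v], improved = apply(values[v], g)
            if improved:
                changed.add(v)
        # Scatter: changed vertices activate their out-neighbours.
        active = set()
        for v in changed:
            active |= scatter(out_edges[v])
    return values


EXAMPLE_EDGES = [
    Edge(0, 1, 1.0),
    Edge(0, 2, 4.0),
    Edge(1, 2, 2.0),
    Edge(2, 3, 1.0),
]

if __name__ == "__main__":
    print(run_gas(4, EXAMPLE_EDGES, source=0))
```

Because each iteration gathers from the values of the previous iteration, the result does not depend on the shard size or on the order in which shards are visited. That independence is what would let a real implementation overlap shard transfers with computation.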
