We develop several parallel algorithms for shortest-distance queries in planar graphs that use graph partitioning in the preprocessing phase to precompute and store distances between selected pairs of vertices. In the query phase, given an arbitrary pair of vertices v and w, the stored information is used to find the distance between v and w quickly. The algorithms are implemented and tested on a high-performance cluster with up to 256 16-core CPUs, and their performance is analyzed and compared.
{"title":"Parallel Shortest-Path Queries in Planar Graphs","authors":"L. Aleksandrov, Guillaume Chapuis, H. Djidjev","doi":"10.1145/2915516.2915518","DOIUrl":"https://doi.org/10.1145/2915516.2915518","journal":{"name":"Proceedings of the ACM Workshop on High Performance Graph Processing"},"publicationDate":"2016-05-31"}
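The preprocessing/query split described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' algorithm: the graph is undirected with non-negative weights, a partition is supplied as a vertex-to-part map, the "selected pairs" are taken to be (boundary vertex, any vertex), and the names `dijkstra`, `preprocess`, and `query` are hypothetical. The sketch is sequential; the precomputation from each boundary vertex is independent and could be run in parallel.

```python
import heapq

def dijkstra(adj, src, allowed=None):
    """Distances from src. adj maps u -> list of (v, weight).
    If allowed is given, the search never leaves that vertex set."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            if allowed is not None and v not in allowed:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def preprocess(adj, part):
    """part maps vertex -> part id (assumed to be given by some partitioner).
    Stores distances from every boundary vertex to all vertices."""
    members, boundary = {}, {}
    for u in adj:
        members.setdefault(part[u], set()).add(u)
        boundary.setdefault(part[u], set())
        # A boundary vertex has at least one neighbor in another part.
        if any(part[v] != part[u] for v, _ in adj[u]):
            boundary[part[u]].add(u)
    from_boundary = {b: dijkstra(adj, b) for bs in boundary.values() for b in bs}
    return members, boundary, from_boundary

def query(adj, part, index, v, w):
    """Distance between v and w using the stored boundary distances."""
    members, boundary, from_boundary = index
    inf = float("inf")
    best = inf
    # Any path that leaves v's part must pass through one of its boundary vertices.
    for b in boundary[part[v]]:
        d = from_boundary[b]
        best = min(best, d.get(v, inf) + d.get(w, inf))
    # If both endpoints share a part, the path may also stay inside it.
    if part[v] == part[w]:
        best = min(best, dijkstra(adj, v, members[part[v]]).get(w, inf))
    return best
```

In this sketch a query touches only the boundary vertices of one part, plus a search confined to that part when both endpoints lie in it, rather than the whole graph. Storage grows with the number of boundary vertices, which is why a partition with small boundaries matters.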
Applications that need to process and manage large graph data sets have recently posed significant challenges for the data science community. This talk discusses the key challenges that must be handled when implementing a next-generation graph processing and management platform. Several key problems need to be addressed in building such a large graph processing system. First, optimized techniques are needed for managing extremely large graph data. Second, new programming models and software tools need to be created for efficiently processing large graphs. This talk will discuss approaches to addressing these two major issues and will highlight our vision for meeting the challenges of next-generation graph processing and management.
{"title":"Towards Next-Generation Graph Processing and Management Platform","authors":"T. Suzumura","doi":"10.1145/2915516.2915517","DOIUrl":"https://doi.org/10.1145/2915516.2915517","journal":{"name":"Proceedings of the ACM Workshop on High Performance Graph Processing"},"publicationDate":"2016-05-31"}
Big data has shifted the computing paradigm of data analysis. While some data can be treated as simple text or independent data records, many applications have data with structural patterns that are modeled as graphs, such as social media, road network traffic, and smart grids. However, only a limited amount of work has addressed the velocity problem of graph processing. In this work, we aim to develop a distributed processing system for solving pattern matching queries on streaming graphs, where graphs evolve over time upon the arrival of streaming graph update events. To achieve this goal, we propose an incremental pattern matching algorithm and implement it on GPS, a vertex-centric distributed graph computing framework. We also extend the GPS framework to support streaming graphs, and adopt a subgraph-centric data model to further reduce communication overhead and improve system performance. Our evaluation using a real wiki trace shows that our approach achieves a 3x to 10x speedup over the batch algorithm and significantly reduces network and memory usage.
{"title":"Distributed Incremental Pattern Matching on Streaming Graphs","authors":"Jyun-Sheng Kao, J. Chou","doi":"10.1145/2915516.2915519","DOIUrl":"https://doi.org/10.1145/2915516.2915519","journal":{"name":"Proceedings of the ACM Workshop on High Performance Graph Processing"},"publicationDate":"2016-05-31"}
Leyuan Wang, Yangzihao Wang, Carl Yang, John Douglas Owens
We implement exact triangle counting in graphs on the GPU using three different methodologies: subgraph matching to a triangle pattern; programmable graph analytics, with a set-intersection approach; and a matrix formulation based on sparse matrix-matrix multiplies. All three deliver best-of-class performance over CPU implementations and over comparable GPU implementations, with the graph-analytic approach achieving the best performance due to its ability to exploit efficient filtering steps to remove unnecessary work and its high-performance set-intersection core.
{"title":"A Comparative Study on Exact Triangle Counting Algorithms on the GPU","authors":"Leyuan Wang, Yangzihao Wang, Carl Yang, John Douglas Owens","doi":"10.1145/2915516.2915521","DOIUrl":"https://doi.org/10.1145/2915516.2915521","journal":{"name":"Proceedings of the ACM Workshop on High Performance Graph Processing"},"publicationDate":"2016-05-31"}
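The set-intersection approach named in the abstract above can be illustrated with a short sequential sketch. It is a CPU-side illustration only, not the authors' GPU implementation. The function name `count_triangles` is hypothetical, and orienting each edge from the lower to the higher vertex id is a common work-reduction trick assumed here, not necessarily the filtering step the paper uses.

```python
def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs."""
    # Forward adjacency: keep only neighbors with a larger id, so that a
    # triangle a < b < c is found exactly once, at its edge (a, b).
    fwd = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        a, b = (u, v) if u < v else (v, u)
        fwd.setdefault(a, set()).add(b)
        fwd.setdefault(b, set())
    total = 0
    for u, nbrs in fwd.items():
        for v in nbrs:
            # Common forward neighbors of u and v each close one triangle.
            total += len(nbrs & fwd[v])
    return total
```

Each edge contributes one independent set intersection, which is the part a parallel implementation would distribute across threads.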
{"title":"Session details: Full Papers Session 2","authors":"T. Suzumura","doi":"10.1145/3260995","DOIUrl":"https://doi.org/10.1145/3260995","journal":{"name":"Proceedings of the ACM Workshop on High Performance Graph Processing"},"publicationDate":"2016-05-31"}
Querying graph data often involves identifying matching paths, either as an end product or as an intermediate step for further graph analysis. Distributed graph querying suffers from a high communication-to-computation cost ratio, due to challenges in constructing comprehensive structural indexes. This can result in severe performance degradation in terms of turnaround time, which often worsens with increasing graph size and density. In this paper, we propose a novel topology abstraction layer that helps improve query response time by reducing the communication overhead for selective exploration of large distributed graphs. We demonstrate the effectiveness of our model and show that our abstraction layer works well in both data-parallel and graph-parallel paradigms.
{"title":"Graph Topology Abstraction for Distributed Path Queries","authors":"Janani Balaji, Rajshekhar Sunderraman","doi":"10.1145/2915516.2915520","DOIUrl":"https://doi.org/10.1145/2915516.2915520","journal":{"name":"Proceedings of the ACM Workshop on High Performance Graph Processing"},"publicationDate":"2016-05-31"}
{"title":"Session details: Keynote Address","authors":"T. Suzumura","doi":"10.1145/3260994","DOIUrl":"https://doi.org/10.1145/3260994","journal":{"name":"Proceedings of the ACM Workshop on High Performance Graph Processing"},"publicationDate":"2016-05-31"}
Shuai Che, Marc S. Orr, Gregory P. Rodgers, J. Gallmeier
This paper studies different approaches to implementing betweenness centrality in a heterogeneous system. Betweenness centrality is an important algorithm in graph processing. It presents multiple levels of parallelism when processing a graph, and is an interesting problem for exploring various optimizations. We implement different versions of betweenness centrality on an AMD accelerated processing unit (APU). These include GPU-only implementations with two edge distribution methods, GPU-side load balancing, and CPU-GPU load balancing in a master-worker model with queue monitoring and in a work-stealing model. We take advantage of the latest developments in heterogeneous system architecture (HSA), such as a unified virtual address space and diverse atomics. We also use different memory scopes and ordering options for different synchronization scenarios. We compare multiple implementations of betweenness centrality, analyze their performance, and discuss important future research directions.
{"title":"Betweenness Centrality in an HSA-enabled System","authors":"Shuai Che, Marc S. Orr, Gregory P. Rodgers, J. Gallmeier","doi":"10.1145/2915516.2915526","DOIUrl":"https://doi.org/10.1145/2915516.2915526","journal":{"name":"Proceedings of the ACM Workshop on High Performance Graph Processing"},"publicationDate":"2016-05-31"}
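For readers unfamiliar with the computation, the following is a minimal sequential sketch of betweenness centrality for unweighted graphs using Brandes' algorithm. It is a standard formulation assumed here for illustration, not the HSA implementation the abstract above describes. The function name `betweenness` and the adjacency-dict input are hypothetical choices, and scores are left unnormalized, so each unordered pair in an undirected graph is counted twice.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness; adj maps vertex -> iterable of neighbors."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Forward phase: BFS from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward phase: accumulate dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc
```

The outer loop over sources is independent across iterations, and within one source the vertices of a BFS level can be processed together; these are the two levels of parallelism that implementations of this kind typically exploit.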
{"title":"Session details: Short Papers Session","authors":"T. Suzumura","doi":"10.1145/3260996","DOIUrl":"https://doi.org/10.1145/3260996","journal":{"name":"Proceedings of the ACM Workshop on High Performance Graph Processing"},"publicationDate":"2016-05-31"}
{"title":"Session details: Full Papers Session 1","authors":"T. Suzumura","doi":"10.1145/3260993","DOIUrl":"https://doi.org/10.1145/3260993","journal":{"name":"Proceedings of the ACM Workshop on High Performance Graph Processing"},"publicationDate":"2016-05-31"}