Seung-Hwan Lim, S. Lee, Gautam Ganesh, Tyler C. Brown, S. Sukumar
Graph Processing Platforms at Scale: Practices and Experiences
2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), published 2015-03-29
DOI: 10.1109/ISPASS.2015.7095783
Citations: 14
Abstract
Graph analysis has revealed patterns and relationships hidden in data from a variety of domains, such as transportation networks, social networks, clinical pathways, and collaboration networks. As these networks grow in size, variety, and complexity, it is a challenge to find the right combination of tools and algorithm implementations to discover new insights from the data. Addressing this challenge, our study presents an extensive empirical evaluation of three representative graph processing platforms: Pegasus, GraphX, and Urika. Each system represents a combination of options in data model, processing paradigm, and infrastructure. We benchmark each platform using three popular graph mining operations (degree distribution, connected components, and PageRank) over real-world graphs. Our experiments show that each graph processing platform has a particular strength for different types of graph operations. While Urika performs best in non-iterative graph operations like degree distribution, GraphX outperforms the other platforms in iterative operations like connected components and PageRank. We conclude this paper by discussing options to optimize the performance of a graph-theoretic operation on each platform for large-scale real-world graphs.
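To make the distinction between non-iterative and iterative operations concrete, the three benchmark operations can be sketched on a single machine as follows. This is only an illustrative sketch, not the implementation used on Pegasus, GraphX, or Urika: the function names, the edge-list input format, the min-label propagation scheme, and the damping factor and iteration count are all assumptions made for the example.

```python
from collections import Counter


def degree_distribution(edges):
    """Non-iterative: one pass over the edges, then one pass over the vertices."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Maps each degree value to the number of vertices that have it.
    return Counter(degree.values())


def connected_components(edges):
    """Iterative: propagate the smallest label along edges until nothing changes."""
    label = {}
    for u, v in edges:
        label.setdefault(u, u)
        label.setdefault(v, v)
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            low = min(label[u], label[v])
            if label[u] != low or label[v] != low:
                label[u] = label[v] = low
                changed = True
    return label


def pagerank(edges, damping=0.85, iterations=20):
    """Iterative: fixed number of power-iteration steps on a directed graph."""
    out_neighbors = {}
    vertices = set()
    for u, v in edges:
        out_neighbors.setdefault(u, []).append(v)
        vertices.update((u, v))
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        # Rank held by vertices without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in vertices if v not in out_neighbors)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in vertices}
        for u, targets in out_neighbors.items():
            share = damping * rank[u] / len(targets)
            for v in targets:
                new_rank[v] += share
        rank = new_rank
    return rank
```

Degree distribution finishes after a single scan of the data, whereas connected components and PageRank revisit the whole edge list on every round. That repeated scanning is the property that separates the two classes of operation in the comparison above.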