Yongli Cheng, F. Wang, Hong Jiang, Yu Hua, D. Feng, XiuNeng Wang
Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016-05-31
DOI: https://doi.org/10.1145/2907294.2907299
DD-Graph: A Highly Cost-Effective Distributed Disk-based Graph-Processing Framework
Existing distributed graph-processing frameworks, e.g., GPS, Pregel, and Giraph, handle large-scale graphs in the memory of clusters built from commodity compute nodes for better scalability and performance. Although these frameworks can scale out to thousands of compute nodes as graphs grow, graphs beyond a certain size usually require an investment in machines that is either beyond the financial capability of, or unprofitable for, most small and medium-sized organizations. At the other end of the spectrum of graph-processing framework research, single-node disk-based frameworks, e.g., GraphChi, handle large-scale graphs on one commodity computer, which makes highly efficient use of hardware but at the cost of poor performance and limited scalability. Motivated by this dichotomy, in this paper we propose a distributed disk-based graph-processing framework, called DD-Graph, that can process super-large graphs on a small cluster while achieving the high performance of existing distributed in-memory graph-processing frameworks.