GSC: Greedy shard caching algorithm for improved I/O efficiency in GraphChi
Dagang Li, Zehua Zheng
2017 IEEE 25th International Conference on Network Protocols (ICNP), pp. 1-6, 2017-10-01
DOI: 10.1109/ICNP.2017.8117588 (https://doi.org/10.1109/ICNP.2017.8117588)
Abstract
Disk-based large-scale graph computation on a single machine has attracted much attention, and GraphChi is one of the most widely accepted solutions. However, we find that GraphChi becomes I/O-bound once memory is moderately abundant: beyond a certain point, adding more memory no longer improves performance. In this work, we propose GSC, a greedy caching algorithm that lets GraphChi make better use of the available memory. GSC alleviates the I/O constraint by caching GraphChi shards that have already been loaded into memory and delaying their write-backs. Experimental results show that, by minimizing unnecessary I/O, GSC can be up to 4x faster than standard GraphChi during computation when memory is constrained, and achieves about a 3x performance gain when sufficient memory is available.
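The abstract does not spell out GSC's caching policy, so the following is only a minimal sketch of the general idea it describes: keep shards that have already been loaded in memory while a budget allows, and delay their write-backs until the end of the computation. Everything here is an assumption made for illustration, not the authors' algorithm or GraphChi's API: the class `ShardCache`, the helper `run_demo`, the first-come "cache it if it fits" rule, and the in-memory `disk` dictionary that stands in for shard files.

```python
class ShardCache:
    """Toy shard cache (hypothetical, not GraphChi code).

    Greedy rule assumed here: a shard is cached the first time it is loaded
    if it still fits in the memory budget, and it is never evicted.
    Cached shards are written back only when flush() is called.
    """

    def __init__(self, budget_bytes, load_fn, store_fn):
        self.budget = budget_bytes
        self.load_fn = load_fn    # shard_id -> bytes (stand-in for a disk read)
        self.store_fn = store_fn  # (shard_id, data) -> None (stand-in for a disk write)
        self.cached = {}          # shard_id -> bytearray held in memory
        self.dirty = set()        # cached shards with a pending write-back
        self.used = 0
        self.reads = 0            # number of simulated disk reads
        self.writes = 0           # number of simulated disk writes

    def get(self, shard_id):
        """Return the shard's data, reading from disk only on a cache miss."""
        if shard_id in self.cached:
            return self.cached[shard_id]
        data = bytearray(self.load_fn(shard_id))
        self.reads += 1
        if self.used + len(data) <= self.budget:
            self.cached[shard_id] = data
            self.used += len(data)
        return data

    def put(self, shard_id, data):
        """Record an update: delay the write if cached, else write through."""
        if shard_id in self.cached:
            self.dirty.add(shard_id)
        else:
            self.store_fn(shard_id, data)
            self.writes += 1

    def flush(self):
        """Write back every cached shard that was modified."""
        for shard_id in sorted(self.dirty):
            self.store_fn(shard_id, self.cached[shard_id])
            self.writes += 1
        self.dirty.clear()


def run_demo(budget_bytes, num_shards=4, shard_size=10, iterations=3):
    """Run a toy iterative computation and return (reads, writes, disk)."""
    disk = {sid: bytes(shard_size) for sid in range(num_shards)}

    def load(sid):
        return disk[sid]

    def store(sid, data):
        disk[sid] = bytes(data)

    cache = ShardCache(budget_bytes, load, store)
    for _ in range(iterations):
        for sid in range(num_shards):
            data = cache.get(sid)
            data[0] = (data[0] + 1) % 256  # stand-in for a vertex/edge update
            cache.put(sid, data)
    cache.flush()
    return cache.reads, cache.writes, disk


if __name__ == "__main__":
    for budget in (0, 25, 40):
        reads, writes, _ = run_demo(budget)
        print(f"budget={budget}: reads={reads}, writes={writes}")
```

In this toy model, a budget of zero reproduces the load-and-write-back-every-iteration pattern, while a budget that holds every shard reduces disk traffic to one read and one write per shard. The real system's savings depend on details the abstract does not give, so the counts here say nothing about the reported 3x to 4x speedups.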