On exploring efficient shuffle design for in-memory MapReduce

Proceedings of the 3rd ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond
Publication date: 2016-06-26
DOI: 10.1145/2926534.2926538

Harunobu Daikoku, H. Kawashima, O. Tatebe


Citations: 4

Abstract


On exploring efficient shuffle design for in-memory MapReduce

MapReduce is widely used for big data analysis in many fields. Shuffling, the inter-node data exchange phase of MapReduce, has been reported to be the major bottleneck of the framework. Acceleration of shuffling has been studied in the literature, and this paper raises two questions. The first concerns the effect of Remote Direct Memory Access (RDMA) on shuffling performance. RDMA enables one machine to read and write data in the local memory of another and is known to be an efficient data transfer mechanism. Does the use of RDMA alone affect the performance of shuffling? The second question concerns the choice of data transfer algorithm. Conventional MapReduce implementations use two types of shuffling algorithms: Fully-Connected and more sophisticated algorithms such as Pairwise. Does the data transfer algorithm affect the performance of shuffling?

To answer these questions, we designed and implemented a new MapReduce system from scratch in C/C++ to maximize performance and preserve design flexibility. For the first question, we compared RDMA shuffling based on rsocket with shuffling based on IPoIB. Experiments with GroupBy showed that RDMA accelerates the map+shuffle phase by around 50%. For the second question, we first compared our in-memory system with Apache Spark to investigate whether it performed more efficiently than an existing system. Our system outperformed Spark by a factor of 3.04 on Word Count and by a factor of 2.64 on BiGram Count. We then compared the two data exchange algorithms, Fully-Connected and Pairwise. Experiments with BiGram Count showed that Fully-Connected without RDMA was 13% more efficient than Pairwise with RDMA.

We conclude that it is necessary to overlap the map and shuffle phases to gain performance improvement. The relatively small percentage of improvement can be attributed to the time-consuming insertion of key-value pairs into the hash map in the map phase.
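To make the two exchange patterns named in the abstract concrete, the following is a minimal single-process sketch in Python. It is not the authors' C/C++ system and involves no networking or RDMA. It assumes that Fully-Connected means every node sends to all other nodes in one step, and that Pairwise means a round-based schedule in which each node sends to exactly one peer per round (the pattern familiar from all-to-all exchanges); the abstract does not define either term, so both readings are assumptions. All function names (`map_word_count`, `partition`, `fully_connected_schedule`, `pairwise_schedule`, `shuffle`) are hypothetical.

```python
from collections import Counter
import zlib


def map_word_count(lines):
    """Map phase: count words locally in a hash map (a Counter here)."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def partition(counts, num_nodes):
    """Split a local hash map into one bucket per destination node."""
    buckets = [dict() for _ in range(num_nodes)]
    for key, value in counts.items():
        # crc32 gives a hash that is stable across runs, unlike built-in hash().
        dest = zlib.crc32(key.encode("utf-8")) % num_nodes
        buckets[dest][key] = value
    return buckets


def fully_connected_schedule(num_nodes):
    """Assumed Fully-Connected pattern: one round, every node sends to all others."""
    pairs = [(src, dst) for src in range(num_nodes)
             for dst in range(num_nodes) if src != dst]
    return [pairs]


def pairwise_schedule(num_nodes):
    """Assumed Pairwise pattern: num_nodes - 1 rounds; in round k,
    node i sends to node (i + k) % num_nodes, so each node sends once
    and receives once per round."""
    return [[(src, (src + k) % num_nodes) for src in range(num_nodes)]
            for k in range(1, num_nodes)]


def shuffle(all_buckets, schedule):
    """Apply a schedule. all_buckets[src][dst] is the data src holds for dst."""
    num_nodes = len(all_buckets)
    # The bucket a node holds for itself needs no transfer.
    received = [Counter(all_buckets[n][n]) for n in range(num_nodes)]
    for round_pairs in schedule:
        for src, dst in round_pairs:
            received[dst].update(all_buckets[src][dst])  # merge counts per key
    return received


# Three simulated nodes, each with its own input lines (made-up data).
inputs = [["a b a", "c"], ["b c c"], ["a d"]]
buckets = [partition(map_word_count(lines), 3) for lines in inputs]
fc_result = shuffle(buckets, fully_connected_schedule(3))
pw_result = shuffle(buckets, pairwise_schedule(3))
```

Both schedules deliver the same data; they differ only in how many transfers are in flight at once, which is the design dimension the second research question examines. The sketch also shows where the map-phase hash-map insertions occur, the cost the abstract identifies as dominant.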

