Non-work-conserving effects in MapReduce: diffusion limit and criticality

Jian Tan, Yandong Wang, Weikuan Yu, Li Zhang
DOI: 10.1145/2591971.2592007 (https://doi.org/10.1145/2591971.2592007)
Published in: Measurement and Modeling of Computer Systems, 2014-06-16
Citations: 14

Abstract

Sequentially arriving jobs share a MapReduce cluster, each desiring a fair allocation of computing resources to serve its associated map and reduce tasks. The model of such a system consists of a processor sharing queue for the MapTasks and a multi-server queue for the ReduceTasks. These two queues are dependent through a constraint: the input data of each ReduceTask are fetched from the intermediate data generated by the MapTasks belonging to the same job. A more general form of this MapReduce queueing model can capture the essence of other distributed data processing systems that contain interdependent processor sharing queues and multi-server queues. Through theoretical modeling and extensive experiments, we show that this dependence, if not carefully dealt with, can cause non-work-conserving effects that negatively impact system performance and scalability. First, we characterize the heavy-traffic approximation. Depending on how tasks are scheduled, the number of jobs in the system can even exhibit jumps in diffusion limits, resulting in prolonged job execution times. This problem can be mitigated by carefully applying a tie-breaking rule for ReduceTasks, a theoretical finding with direct engineering implications. Second, we empirically validate a criticality phenomenon. MapReduce systems experience an undesirable performance degradation once they reach certain critical points, another finding that offers fundamental guidance on managing MapReduce systems.
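
The coupling between the two queues can be illustrated with a toy simulation. This is a minimal sketch under assumptions that are not taken from the paper: each job has a single ReduceTask, the map phase is served by one unit-rate processor shared equally among jobs, and the two slot policies (`eager`: a job claims a reduce slot on arrival and holds it while its maps run; otherwise it requests a slot only after its maps finish) are hypothetical stand-ins, not the tie-breaking rule studied by the authors. The function name `simulate` and the job tuples are likewise illustrative.

```python
from fractions import Fraction


def simulate(jobs, reduce_slots, eager):
    """Toy model: jobs is a list of (arrival, map_work, reduce_work).

    Map work is served by one unit-rate processor shared equally among
    all arrived jobs with map work left. Each job has one reduce task,
    which needs a reduce slot and can only run once the job's maps are
    done. Returns the completion time of each job.
    """
    n = len(jobs)
    arrivals = [Fraction(a) for a, _, _ in jobs]
    map_left = [Fraction(m) for _, m, _ in jobs]
    red_left = [Fraction(r) for _, _, r in jobs]
    arrived = [False] * n
    requested = [False] * n
    waiting, holding = [], []  # FIFO slot queue, jobs holding a slot
    finish = [None] * n
    t = Fraction(0)

    while any(f is None for f in finish):
        for j in range(n):
            if not arrived[j] and arrivals[j] <= t:
                arrived[j] = True
        # Assumed policy: eager jobs ask for a slot on arrival,
        # otherwise only after their map work has completed.
        for j in range(n):
            if arrived[j] and not requested[j] and (eager or map_left[j] == 0):
                requested[j] = True
                waiting.append(j)
        while waiting and len(holding) < reduce_slots:
            holding.append(waiting.pop(0))

        mapping = [j for j in range(n) if arrived[j] and map_left[j] > 0]
        reducing = [j for j in holding if map_left[j] == 0]

        # Time until the next arrival, map completion, or reduce completion.
        steps = [arrivals[j] - t for j in range(n) if not arrived[j]]
        steps += [map_left[j] * len(mapping) for j in mapping]
        steps += [red_left[j] for j in reducing]
        dt = min(steps)
        t += dt

        for j in mapping:
            map_left[j] -= dt / len(mapping)
        for j in reducing:
            red_left[j] -= dt
            if red_left[j] == 0:
                finish[j] = t
                holding.remove(j)
    return finish


# A long job arrives first, a short job shortly after; one reduce slot.
jobs = [(0, 10, 1), (1, 1, 1)]
eager_finish = simulate(jobs, reduce_slots=1, eager=True)
lazy_finish = simulate(jobs, reduce_slots=1, eager=False)
print("eager:", [str(f) for f in eager_finish])
print("lazy: ", [str(f) for f in lazy_finish])
```

In this example the short job finishes its maps at time 3 in both runs. Under the eager policy the only reduce slot is held by the long job, which is still mapping, so the slot sits idle while ready reduce work waits, and the short job completes at time 13 instead of 4. That idle-but-occupied slot is the kind of non-work-conserving behavior the abstract refers to, shown here only in a simplified setting.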