MRONLINE: MapReduce online performance tuning

Min Li, Liangzhao Zeng, S. Meng, Jian Tan, Li Zhang, A. Butt, Nicholas C. Fuller
DOI: 10.1145/2600212.2600229 (https://doi.org/10.1145/2600212.2600229)
Published in: IEEE International Symposium on High-Performance Parallel Distributed Computing, 2014-06-23
Citations: 139

Abstract

MapReduce job parameter tuning is a daunting and time-consuming task. The parameter configuration space is huge: more than 70 parameters affect job performance. It is also difficult for users to determine suitable parameter values without first understanding the characteristics of the MapReduce application. Systematically exploring the parameter space and selecting a near-optimal configuration is therefore a challenge. Existing offline tuning approaches are slow and inefficient because they entail multiple test runs and significant human effort. To this end, we propose an online performance tuning system, MRONLINE, that monitors a job's execution, tunes the associated performance parameters based on collected statistics, and provides fine-grained control over parameter configuration. MRONLINE allows each task to have a different configuration, instead of requiring the same configuration for all tasks. Moreover, we design a gray-box-based smart hill climbing algorithm that can efficiently converge to a near-optimal configuration with high probability. To improve search quality and speed up convergence, we also incorporate a set of MapReduce-specific tuning rules in MRONLINE. Our results using a real implementation on a representative 19-node cluster show that dynamic performance tuning can improve MapReduce application performance by up to 30% compared to the default configuration used in YARN.
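The abstract does not spell out the search procedure, so the following is only a minimal sketch of a generic hill climbing search with an initial global sampling phase, not MRONLINE's actual algorithm. The parameter names, their ranges, and the `synthetic_cost` function are all hypothetical stand-ins; a real tuner would score each configuration using statistics collected from running tasks.

```python
import random

# Hypothetical parameter space: names and ranges are illustrative only,
# not taken from the paper or from any Hadoop/YARN release.
SPACE = {
    "sort_buffer_mb": (50, 500),
    "map_memory_mb": (512, 4096),
    "reduce_parallel_copies": (1, 50),
}


def synthetic_cost(cfg):
    """Stand-in for a measured task runtime (lower is better).

    Assumption: a smooth bowl around an arbitrary target configuration.
    """
    target = {
        "sort_buffer_mb": 300,
        "map_memory_mb": 2048,
        "reduce_parallel_copies": 20,
    }
    return sum(
        ((cfg[k] - target[k]) / (SPACE[k][1] - SPACE[k][0])) ** 2
        for k in SPACE
    )


def random_config(rng):
    """Draw one configuration uniformly from the parameter space."""
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SPACE.items()}


def neighbor(cfg, rng, step):
    """Perturb each parameter by up to `step` of its range, clamped to bounds."""
    out = {}
    for k, (lo, hi) in SPACE.items():
        delta = int((hi - lo) * step * rng.uniform(-1, 1))
        out[k] = max(lo, min(hi, cfg[k] + delta))
    return out


def hill_climb(cost, rng, global_samples=20, local_steps=200,
               step=0.2, shrink=0.95):
    # Global phase: sample the space and start from the best sample.
    best = min((random_config(rng) for _ in range(global_samples)), key=cost)
    best_cost = cost(best)
    # Local phase: accept only improving neighbors; shrink the
    # neighborhood each time a candidate fails to improve.
    for _ in range(local_steps):
        cand = neighbor(best, rng, step)
        c = cost(cand)
        if c < best_cost:
            best, best_cost = cand, c
        else:
            step = max(0.01, step * shrink)
    return best, best_cost


if __name__ == "__main__":
    cfg, score = hill_climb(synthetic_cost, random.Random(0))
    print(cfg, score)
```

In an online setting, each call to `cost` would correspond to running one task with the candidate configuration, which fits the abstract's point that tasks within a job can carry different configurations.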