A Recommendation-Based Parameter Tuning Approach for Hadoop

Lin Cai, Yong Qi, Jingwei Li
2017 IEEE 7th International Symposium on Cloud and Service Computing (SC2), published 2017-11-01
DOI: 10.1109/SC2.2017.41 (https://doi.org/10.1109/SC2.2017.41)
Citations: 7

Abstract

We have entered the big data era. Hadoop, one of the popular big data processing platforms, has many parameters that closely affect the utilization of resources (e.g. CPU or memory). Tuning these parameters is therefore an important way to improve the resource utilization of Hadoop. However, tuning parameters manually is impractical because the time cost of tuning is too high. Hence it is necessary to configure parameters automatically and quickly to optimize resource utilization. Earlier auto-tuning methods often take a long time to reach the optimal configuration, which reduces the overall resource efficiency of the cluster. In this paper, we propose mrEtalon, an adaptive tuning framework that recommends a near-optimal configuration for a new job in a short time. mrEtalon maintains a configuration repository that provides candidate configurations, together with a recommendation engine based on collaborative filtering that accelerates parameter optimization. We have deployed mrEtalon in our experimental cluster, and the results demonstrate that, for a new MapReduce application, mrEtalon can reduce the recommendation time to 20% to 30% of that required by earlier methods while keeping nearly the same recommendation quality.
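
The abstract does not describe the recommendation engine in detail, so the following is only a minimal sketch of how collaborative filtering over a configuration repository could work in principle. It is not mrEtalon's implementation. The job names, the configuration labels (c1 to c4), the scores, and the functions `cosine` and `recommend` are all hypothetical. The assumed idea is that a new job is run under a few probe configurations, compared with previously seen jobs on those configurations, and the scores of the untested configurations are predicted as a similarity-weighted average.

```python
from math import sqrt

# Hypothetical repository: score of each known job under each candidate
# configuration (higher is better, e.g. a normalized inverse runtime).
# These numbers are made up for illustration only.
HISTORY = {
    "wordcount": {"c1": 0.9, "c2": 0.4, "c3": 0.6, "c4": 0.2},
    "terasort":  {"c1": 0.3, "c2": 0.8, "c3": 0.5, "c4": 0.9},
    "grep":      {"c1": 0.8, "c2": 0.5, "c3": 0.7, "c4": 0.3},
}


def cosine(a, b):
    """Cosine similarity over the configurations both jobs have scores for."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[c] * b[c] for c in shared)
    norm_a = sqrt(sum(a[c] ** 2 for c in shared))
    norm_b = sqrt(sum(b[c] ** 2 for c in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(probe_scores, history=HISTORY):
    """Predict scores for untested configurations; return (best, predictions)."""
    # Similarity of the new job to every known job, based on the probe runs.
    sims = {job: cosine(probe_scores, scores) for job, scores in history.items()}
    # Candidate configurations are those in the repository not yet probed.
    candidates = {c for scores in history.values() for c in scores}
    candidates -= set(probe_scores)
    predictions = {}
    for c in candidates:
        raters = [job for job in history if c in history[job]]
        total = sum(sims[job] for job in raters)
        if total > 0:
            weighted = sum(sims[job] * history[job][c] for job in raters)
            predictions[c] = weighted / total
    best = max(predictions, key=predictions.get) if predictions else None
    return best, predictions


if __name__ == "__main__":
    # A new job probed under only two configurations.
    best, predictions = recommend({"c1": 0.85, "c2": 0.45})
    print(best, predictions)
```

In this sketch only two probe runs are needed before a recommendation is made, which illustrates why a recommendation-based approach can be faster than searching the configuration space directly. The quality of the result depends on how representative the repository is.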