A Recommendation-Based Parameter Tuning Approach for Hadoop
Lin Cai, Yong Qi, Jingwei Li
2017 IEEE 7th International Symposium on Cloud and Service Computing (SC2), published 2017-11-01
DOI: 10.1109/SC2.2017.41 (https://doi.org/10.1109/SC2.2017.41)
Citations: 7
Abstract
We have entered the big data era. Hadoop, one of the most popular big data processing platforms, exposes many parameters that closely affect resource utilization (e.g., CPU or memory). Tuning these parameters is therefore an important way to improve Hadoop's resource utilization. However, manual tuning is impractical because it takes too much time, so parameters need to be configured automatically and quickly. Existing auto-tuning methods often take a long time to reach the optimal configuration, which reduces the overall resource efficiency of the cluster. In this paper, we propose mrEtalon, an adaptive tuning framework that recommends a near-optimal configuration for a new job in a short time. mrEtalon maintains a configuration repository that provides candidate configurations, together with a collaborative-filtering-based recommendation engine that accelerates parameter optimization. We have deployed mrEtalon in our experimental cluster, and the results show that, for a new MapReduce application, mrEtalon reduces recommendation time to 20% to 30% of that of earlier methods while keeping nearly the same recommendation quality.
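The abstract does not spell out how the recommendation engine works, so the following is only a minimal sketch of the general idea of collaborative filtering over a configuration repository, not mrEtalon's actual algorithm. It assumes a hypothetical repository of normalized running times for previously tuned jobs, a new job that has been probed on a few configurations, and an inverse-distance similarity with a k-nearest-neighbour weighted average. All names and numbers (`REPOSITORY`, `similarity`, `recommend`, the job names, the timings) are illustrative assumptions.

```python
import math

# Hypothetical configuration repository (an assumption for illustration):
# for each previously tuned job, the normalized running time under each
# candidate configuration, indexed 0..4. Lower is better.
REPOSITORY = {
    "wordcount": [1.00, 0.80, 0.90, 0.60, 0.70],
    "grep":      [1.00, 0.85, 0.95, 0.65, 0.75],
    "terasort":  [1.00, 0.70, 0.50, 0.95, 0.85],
}


def similarity(probes, history):
    """Inverse-distance similarity over the configurations the new job has tried."""
    dist = math.sqrt(sum((t - history[c]) ** 2 for c, t in probes.items()))
    return 1.0 / (1.0 + dist)


def recommend(probes, repository, k=2):
    """Return (config_index, predicted_time) for the best untried configuration.

    probes maps a configuration index to the time measured for the new job.
    At least one configuration must remain untried.
    """
    # Rank known jobs by similarity to the new job and keep the k nearest.
    neighbours = sorted(
        ((similarity(probes, hist), hist) for hist in repository.values()),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbours)

    # Predict each untried configuration as a similarity-weighted average
    # of the neighbours' recorded times.
    n_configs = len(next(iter(repository.values())))
    predictions = {
        c: sum(sim * hist[c] for sim, hist in neighbours) / total
        for c in range(n_configs)
        if c not in probes
    }
    best = min(predictions, key=predictions.get)
    return best, predictions[best]


# The new job was probed on configurations 0, 1 and 2 only.
best, predicted = recommend({0: 1.00, 1: 0.82, 2: 0.92}, REPOSITORY)
print(best, round(predicted, 3))
```

In this toy data the new job behaves like `wordcount` and `grep` on the probed configurations, so the sketch recommends configuration 3, which those two jobs ran fastest. Only three probe runs are needed rather than a search over every candidate, which is the kind of saving in recommendation time the abstract describes.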