Otterman: A Novel Approach of Spark Auto-tuning by a Hybrid Strategy

Haizhou Du, Ping Han, Wei Chen, Yi Wang, Chenlu Zhang
Published in: 2018 5th International Conference on Systems and Informatics (ICSAI)
Publication date: 2018-11-01
DOI: 10.1109/ICSAI.2018.8599304 (https://doi.org/10.1109/ICSAI.2018.8599304)
Citations: 7

Abstract

Spark has become an attractive platform for big data analytics in recent years thanks to advantages such as parallelism, fault tolerance, and its handling of the complexity associated with cluster setup. On the Spark platform, users can adjust parameter configurations to suit different job requirements and applications in order to optimize performance. This raises a problem that cannot be ignored: Spark already has more than 180 parameters, and the number of possible combinations is so large that manual tuning cannot capture the impact of every parameter on performance. To reduce the heavy reliance on expert experience and manual operation, we propose Otterman, a parameter optimization approach that combines the Simulated Annealing algorithm with the Least Squares method and dynamically adjusts parameters according to job type to obtain an optimal configuration and improve performance. Simulated Annealing can find the optimal solution but converges slowly. We use the Least Squares method to speed up its convergence to the optimal solution. Otterman is simple to apply and incurs no additional cost. Experiments verify the effectiveness of the approach: Otterman improves average performance by 30% compared with the default parameter configuration, with an accuracy of about 68%.
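The abstract does not say how Simulated Annealing and Least Squares are wired together, so the following is only a minimal sketch of one plausible combination, not the authors' method. It assumes a least-squares line is fitted per parameter over a few sampled configurations and that the fitted slopes choose the annealing start point. The parameter names, bounds, and the `toy_runtime` cost function are all hypothetical stand-ins; no real Spark job is run.

```python
import math
import random

# Hypothetical search space: two made-up tunables with (low, high) bounds.
BOUNDS = {"executor_memory_gb": (1.0, 16.0), "shuffle_partitions": (8.0, 512.0)}


def toy_runtime(cfg):
    """Synthetic stand-in for a measured job runtime in seconds (not Spark data)."""
    m = cfg["executor_memory_gb"]
    p = cfg["shuffle_partitions"]
    return 60.0 + (m - 10.0) ** 2 + ((p - 200.0) / 20.0) ** 2


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return mean_y - b * mean_x, b


def warm_start(samples, costs):
    """Assumed heuristic: start a quarter-range away from the midpoint,
    on the side where the fitted line predicts a lower runtime."""
    start = {}
    for k, (lo, hi) in BOUNDS.items():
        _, slope = fit_line([s[k] for s in samples], costs)
        mid = (lo + hi) / 2.0
        shift = 0.25 * (hi - lo)
        start[k] = mid - shift if slope > 0 else mid + shift
    return start


def anneal(start, cost_fn, rng, steps=500, t0=50.0, cooling=0.99):
    """Standard simulated annealing with Gaussian moves and geometric cooling."""
    current, current_cost = dict(start), cost_fn(start)
    best, best_cost = dict(current), current_cost
    temp = t0
    for _ in range(steps):
        candidate = {}
        for k, (lo, hi) in BOUNDS.items():
            moved = current[k] + rng.gauss(0.0, 0.05 * (hi - lo))
            candidate[k] = min(hi, max(lo, moved))  # keep inside the bounds
        cand_cost = cost_fn(candidate)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
        temp *= cooling
    return best, best_cost


def tune(seed=0, n_samples=12):
    """Sample configurations, fit slopes, then anneal from the warm start."""
    rng = random.Random(seed)
    samples = [{k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}
               for _ in range(n_samples)]
    costs = [toy_runtime(s) for s in samples]
    start = warm_start(samples, costs)
    best, best_cost = anneal(start, toy_runtime, rng)
    return start, best, best_cost
```

Calling `tune()` returns the warm-start configuration, the best configuration found, and its synthetic runtime. In a real setting `toy_runtime` would be replaced by an actual job measurement, which is expensive, so the sample count and step budget would need to be far smaller than the values assumed here.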