Employment of Optimal Approximations on Apache Hadoop Checkpoint Technique for Performance Improvements

Paulo Vinicius Cardoso, R. Fazul, P. Barcelos
{"title":"Employment of Optimal Approximations on Apache Hadoop Checkpoint Technique for Performance Improvements","authors":"Paulo Vinicius Cardoso, R. Fazul, P. Barcelos","doi":"10.1109/ICSA47634.2020.00009","DOIUrl":null,"url":null,"abstract":"The Checkpoint and Recovery (CR) technique is widely used due to its fault tolerance efficiency. The Apache Hadoop framework uses this technique as a way to avoid failures in its distributed file system. However, determining the optimal interval between successive checkpoints is a challenge, mainly inside Hadoop as it does not allow real-time modifications. The Dynamic Configuration Architecture (DCA) was created to solve this issue by enabling changes in the checkpoint period without any interruption of the Hadoop services. This paper presents improvements for the DCA through the configuration of the Hadoop checkpoint period in real-time based on optimal period approximations that were already endorsed by the literature. The proposed improvement depends on the tracking of the system resources. The data collected from these resources are stored in a history of attributes: a tree of monitored elements where data is updated as new observations are experienced in the system. This feature enables the user to estimate system factors so that our solution computes the checkpoints costs and the mean time between failures (MTBF). For the validation, experiments with transient failure in the NameNode were created and the usage of the history of attributes was tested in different scenarios. The evaluation results show that an adaptive configuration of checkpoint periods reduces the wasted time caused by failures in the NameNode and improves Hadoop performance. 
Also, the history of attributes demonstrated its value by providing an efficient way to estimate the system factors.","PeriodicalId":6599,"journal":{"name":"2017 IEEE International Conference on Software Architecture (ICSA)","volume":"32 1","pages":"1-10"},"PeriodicalIF":0.0000,"publicationDate":"2020-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Conference on Software Architecture (ICSA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSA47634.2020.00009","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

The Checkpoint and Recovery (CR) technique is widely used due to its fault tolerance efficiency. The Apache Hadoop framework uses this technique as a way to avoid failures in its distributed file system. However, determining the optimal interval between successive checkpoints is a challenge, especially inside Hadoop, which does not allow modifications at run time. The Dynamic Configuration Architecture (DCA) was created to solve this issue by enabling changes to the checkpoint period without any interruption of the Hadoop services. This paper presents improvements to the DCA by configuring the Hadoop checkpoint period in real time based on optimal period approximations already endorsed by the literature. The proposed improvement depends on tracking the system resources. The data collected from these resources are stored in a history of attributes: a tree of monitored elements whose data are updated as new observations occur in the system. This feature enables the user to estimate system factors so that our solution can compute the checkpoint costs and the mean time between failures (MTBF). For validation, experiments with transient failures in the NameNode were conducted and the usage of the history of attributes was tested in different scenarios. The evaluation results show that an adaptive configuration of checkpoint periods reduces the time wasted due to failures in the NameNode and improves Hadoop performance. Also, the history of attributes demonstrated its value by providing an efficient way to estimate the system factors.
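The abstract refers to optimal checkpoint-period approximations endorsed by the literature; the classic first-order result is Young's formula, T ≈ √(2·C·MTBF), where C is the checkpoint cost. A minimal Python sketch of how a period could be derived from observed failures and a measured checkpoint cost (function names and sample numbers are illustrative assumptions, not the paper's actual implementation):

```python
import math

def estimate_mtbf(failure_timestamps: list) -> float:
    """Estimate MTBF as the mean gap between observed failure times (seconds)."""
    gaps = [b - a for a, b in zip(failure_timestamps, failure_timestamps[1:])]
    return sum(gaps) / len(gaps)

def optimal_checkpoint_period(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's first-order approximation: T_opt ~ sqrt(2 * C * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

# Illustrative numbers: a 30 s checkpoint cost and failures roughly every 2 h.
failures = [0.0, 7200.0, 14100.0, 21600.0]
mtbf = estimate_mtbf(failures)                      # 7200.0 s
period = optimal_checkpoint_period(30.0, mtbf)      # ~657 s
print(round(period))
```

In a DCA-style setup, such a period would be recomputed as the history of attributes accumulates new observations of checkpoint cost and failure times, and pushed to Hadoop without restarting its services.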