Developing Checkpointing and Recovery Procedures with the Storage Services of Amazon Web Services

Luan Teylo, R. Brum, L. Arantes, Pierre Sens, Lúcia M. A. Drummond
{"title":"Developing Checkpointing and Recovery Procedures with the Storage Services of Amazon Web Services","authors":"Luan Teylo, R. Brum, L. Arantes, Pierre Sens, Lúcia M. A. Drummond","doi":"10.1145/3409390.3409407","DOIUrl":null,"url":null,"abstract":"In recent years, cloud computing has grown in popularity as they give users easy and almost instantaneous access to different computational resources. Some cloud providers, like Amazon, took advantage of the growing popularity and offered their VMs in some different hiring types: on-demand, reserved, and spot. The last type is usually offered at lower prices but can be terminated by the provider at any time. To deal with those failures, checkpoint and recovery procedures are typically used. In this context, we propose and analyze checkpoint and recovery procedures using three different storage services from Amazon: Amazon Simple Storage Service (S3), Amazon Elastic Block Store (EBS) and Amazon Elastic File System (EFS), considering spot VMs. These procedures were built upon the HADS framework, designed to schedule bag-of-tasks applications to spot and on-demand VMs. Our results showed that EBS outperformed the other approaches in terms of time spent on recording a checkpoint. But it required more time in the recovery procedure. EFS presented checkpointing and recovery times close to EBS but with higher monetary costs than the other services. S3 proved to be the best option in terms of monetary cost but required a longer time for recording a checkpoint, individually. However, when concurrent checkpoints were analysed, which can occur in a real application with lots of tasks, in our tests, S3 outperformed EFS in terms of execution time also.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Workshop Proceedings of the 49th International Conference on Parallel Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3409390.3409407","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

In recent years, cloud computing has grown in popularity as they give users easy and almost instantaneous access to different computational resources. Some cloud providers, like Amazon, took advantage of the growing popularity and offered their VMs in some different hiring types: on-demand, reserved, and spot. The last type is usually offered at lower prices but can be terminated by the provider at any time. To deal with those failures, checkpoint and recovery procedures are typically used. In this context, we propose and analyze checkpoint and recovery procedures using three different storage services from Amazon: Amazon Simple Storage Service (S3), Amazon Elastic Block Store (EBS) and Amazon Elastic File System (EFS), considering spot VMs. These procedures were built upon the HADS framework, designed to schedule bag-of-tasks applications to spot and on-demand VMs. Our results showed that EBS outperformed the other approaches in terms of time spent on recording a checkpoint. But it required more time in the recovery procedure. EFS presented checkpointing and recovery times close to EBS but with higher monetary costs than the other services. S3 proved to be the best option in terms of monetary cost but required a longer time for recording a checkpoint, individually. However, when concurrent checkpoints were analysed, which can occur in a real application with lots of tasks, in our tests, S3 outperformed EFS in terms of execution time also.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
使用Amazon Web Services的存储服务开发检查点和恢复程序
近年来,云计算越来越受欢迎,因为它们使用户可以轻松且几乎即时地访问不同的计算资源。一些云提供商,如亚马逊,利用了日益流行的趋势,并提供了一些不同类型的虚拟机:按需、保留和现货。最后一种通常以较低的价格提供,但提供商可以随时终止。为了处理这些故障,通常使用检查点和恢复过程。在此背景下,我们使用Amazon的三种不同的存储服务:Amazon Simple storage Service (S3)、Amazon Elastic Block Store (EBS)和Amazon Elastic File System (EFS)提出并分析了检查点和恢复过程,并考虑了spot vm。这些过程构建在HADS框架之上,旨在将任务包应用程序调度到指定vm和按需vm。我们的结果表明,EBS在记录检查点所花费的时间方面优于其他方法。但在恢复过程中需要更多时间。EFS提供的检查点和恢复时间接近EBS,但比其他服务的货币成本更高。S3被证明是在金钱成本方面的最佳选择,但是单独记录检查点需要更长的时间。但是,当分析并发检查点时(这可能发生在具有许多任务的实际应用程序中),在我们的测试中,S3在执行时间方面也优于EFS。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Characterizing the Cost-Accuracy Performance of Cloud Applications Symmetric Tokens based Group Mutual Exclusion Fast Modeling of Network Contention in Batch Point-to-point Communications by Packet-level Simulation with Dynamic Time-stepping Exploiting Dynamism in HPC Applications to Optimize Energy-Efficiency Developing Checkpointing and Recovery Procedures with the Storage Services of Amazon Web Services
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1