Strategies for storage of checkpointing data using non-dedicated repositories on Grid systems

R. Camargo, Renato Cerqueira, Fabio Kon
{"title":"Strategies for storage of checkpointing data using non-dedicated repositories on Grid systems","authors":"R. Camargo, Renato Cerqueira, Fabio Kon","doi":"10.1145/1101499.1101500","DOIUrl":null,"url":null,"abstract":"Dealing with the large amounts of data generated by long-running parallel applications is one of the most challenging aspects of Grid Computing. Periodic checkpoints might be taken to guarantee application progression, producing even more data. The classical approach is to employ high-throughput checkpoint servers connected to the computational nodes by high speed networks. In the case of Opportunistic Grid Computing, we do not want to be forced to rely on such dedicated hardware. Instead, we want to use the shared Grid nodes to store application data in a distributed fashion.In this work, we evaluate several strategies to store checkpoints on distributed non-dedicated repositories. We consider the tradeoff among computational overhead, storage overhead, and degree of fault-tolerance of these strategies. We compare the use of replication, parity information, and information dispersal (IDA). We used InteGrade, an object-oriented Grid middleware, to implement the storage strategies and perform evaluation experiments.","PeriodicalId":313448,"journal":{"name":"Middleware for Grid Computing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2005-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Middleware for Grid Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1101499.1101500","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 20

Abstract

Dealing with the large amounts of data generated by long-running parallel applications is one of the most challenging aspects of Grid Computing. Periodic checkpoints might be taken to guarantee application progression, producing even more data. The classical approach is to employ high-throughput checkpoint servers connected to the computational nodes by high speed networks. In the case of Opportunistic Grid Computing, we do not want to be forced to rely on such dedicated hardware. Instead, we want to use the shared Grid nodes to store application data in a distributed fashion.In this work, we evaluate several strategies to store checkpoints on distributed non-dedicated repositories. We consider the tradeoff among computational overhead, storage overhead, and degree of fault-tolerance of these strategies. We compare the use of replication, parity information, and information dispersal (IDA). We used InteGrade, an object-oriented Grid middleware, to implement the storage strategies and perform evaluation experiments.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
在网格系统上使用非专用存储库存储检查点数据的策略
处理由长时间运行的并行应用程序生成的大量数据是网格计算最具挑战性的方面之一。可以采用定期检查点来保证应用程序的进展,从而产生更多的数据。经典的方法是采用高速网络连接到计算节点的高吞吐量检查点服务器。在机会网格计算的情况下,我们不希望被迫依赖这种专用硬件。相反,我们希望使用共享网格节点以分布式方式存储应用程序数据。在这项工作中,我们评估了几种在分布式非专用存储库上存储检查点的策略。我们考虑了这些策略的计算开销、存储开销和容错程度之间的权衡。我们比较了复制、奇偶校验信息和信息分散(IDA)的使用。我们使用面向对象的网格中间件InteGrade来实现存储策略并进行评估实验。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Replication for dependability on virtualized cloud environments VMR: volunteer MapReduce over the large scale internet An analytical approach for predicting QoS of web services choreographies Towards an SPL-based monitoring middleware strategy for cloud computing applications Estimating resource costs of data-intensive workloads in public clouds
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1