Strategies for storage of checkpointing data using non-dedicated repositories on Grid systems

Middleware for Grid Computing. Pub Date: 2005-11-28. DOI: 10.1145/1101499.1101500

R. Camargo, Renato Cerqueira, Fabio Kon

Citations: 20

Abstract

Dealing with the large amounts of data generated by long-running parallel applications is one of the most challenging aspects of Grid Computing. Periodic checkpoints might be taken to guarantee application progression, producing even more data. The classical approach is to employ high-throughput checkpoint servers connected to the computational nodes by high-speed networks. In the case of Opportunistic Grid Computing, we do not want to be forced to rely on such dedicated hardware. Instead, we want to use the shared Grid nodes to store application data in a distributed fashion.

In this work, we evaluate several strategies to store checkpoints on distributed non-dedicated repositories. We consider the tradeoff among computational overhead, storage overhead, and degree of fault tolerance of these strategies. We compare the use of replication, parity information, and information dispersal (IDA). We used InteGrade, an object-oriented Grid middleware, to implement the storage strategies and perform evaluation experiments.
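The tradeoff between storage overhead and fault tolerance named in the abstract can be illustrated with a minimal sketch. It is not the InteGrade implementation, and every function name in it is hypothetical. It assumes the textbook forms of the three strategies: full replication with a given number of copies, a single XOR parity fragment over m data fragments, and an information dispersal scheme in which any m of n fragments suffice to rebuild the data. The IDA encoding itself is not implemented here; only its storage ratio is computed.

```python
from functools import reduce
from typing import List, Optional, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, m: int) -> List[bytes]:
    """Split data into m equal-size fragments plus one XOR parity fragment."""
    size = -(-len(data) // m)  # ceiling division
    padded = data.ljust(size * m, b"\0")  # zero-pad so all fragments match
    fragments = [padded[i * size:(i + 1) * size] for i in range(m)]
    return fragments + [reduce(xor_bytes, fragments)]


def recover(fragments: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the data when at most one fragment (marked None) is lost."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("a single XOR parity tolerates only one lost fragment")
    restored = list(fragments)
    if missing:
        present = [f for f in fragments if f is not None]
        # XOR of all surviving fragments equals the lost one.
        restored[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity fragment and the padding.
    return b"".join(restored[:-1])[:original_len]


# Each helper returns (stored bytes / original bytes, tolerated lost fragments).
def replication(copies: int) -> Tuple[float, int]:
    return float(copies), copies - 1


def parity(m: int) -> Tuple[float, int]:
    return (m + 1) / m, 1


def ida(n: int, m: int) -> Tuple[float, int]:
    return n / m, n - m


if __name__ == "__main__":
    checkpoint = b"checkpoint state " * 10  # stand-in for real checkpoint data
    stored = split_with_parity(checkpoint, 4)
    stored[2] = None  # simulate one repository node leaving the Grid
    assert recover(stored, len(checkpoint)) == checkpoint
    for name, result in [("replication(2)", replication(2)),
                         ("parity(4)", parity(4)),
                         ("ida(6, 4)", ida(6, 4))]:
        print(name, result)
```

Under these assumptions, two full copies store twice the data to survive one lost node, parity over four fragments stores 1.25 times the data for the same tolerance, and a 4-of-6 dispersal stores 1.5 times the data while surviving two losses. The computational cost of encoding and decoding, which the paper also weighs, is not modeled by this sketch.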




