Efficient Message Logging to Support Process Replicas in a Volunteer Computing Environment
M. Islam, Hien Nguyen, J. Subhlok, E. Gabriel
2015 IEEE International Parallel and Distributed Processing Symposium Workshop, published 2015-05-25
DOI: 10.1109/IPDPSW.2015.91 (https://doi.org/10.1109/IPDPSW.2015.91)
Citations: 1
Abstract
This research is set in the context of Volpex, a communication framework based on Put/Get calls to an abstract global space that can seamlessly handle multiple active replicas of communicating processes. Volpex is designed for heterogeneous and unreliable execution environments, where parallel applications need replication as well as checkpointing to make continuous progress. Because different instances of the same process can execute in the same logical state at different clock times, communicated data objects must be logged to ensure consistent execution of process replicas. Logging to support redundancy can add significant overhead in execution time and storage and can limit scalability. In this paper, we develop, implement, and evaluate Log on Read and Log on Write logging schemes to support redundant communication. Log on Read schemes log a copy of the data object returned to every Get (or Read) request. Log on Write schemes, in contrast, log the old data object only when a Put request overwrites a data object. This reduces redundant copying, but identifying the correct data object to return to a Get request becomes complex. To make this possible, a Virtual Time Stamp (VTS) that captures the global execution state is logged along with the data object. We develop an optimized Log on Read scheme that minimizes redundancy and an optimized Log on Write scheme that reduces the VTS size and overhead. Experimental results show that the optimizations are effective in terms of storage and time overhead and that an optimized Log on Read scheme presents the best tradeoffs for most scenarios.
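The contrast between the two schemes can be illustrated with a toy Python sketch. It is not the Volpex implementation, and every name in it (`LogOnReadSpace`, `LogOnWriteSpace`, `replay`, the `(pid, req)` request identifiers) is hypothetical. It assumes that each logical request carries a process id and a per-process request number, that the leading replica's requests arrive in order, and that a Get is only issued for a key that already exists. The VTS here is simply a full snapshot of per-process request counters, and the lookup rule in `LogOnWriteSpace.get` is one plausible reading of the abstract rather than the paper's algorithm.

```python
from collections import defaultdict


class LogOnReadSpace:
    """Toy global space that logs a copy of the object returned to each Get."""

    def __init__(self):
        self.store = {}          # key -> current data object
        self.read_log = {}       # (pid, req) -> object returned to that Get
        self.seen_puts = set()   # (pid, req) of Puts already applied

    def put(self, pid, req, key, value):
        # Only the first replica to issue a logical Put changes the store.
        if (pid, req) in self.seen_puts:
            return
        self.seen_puts.add((pid, req))
        self.store[key] = value

    def get(self, pid, req, key):
        # The first replica logs what it read; later replicas get the copy.
        if (pid, req) not in self.read_log:
            self.read_log[(pid, req)] = self.store[key]
        return self.read_log[(pid, req)]


class LogOnWriteSpace:
    """Toy global space that logs the old object only when a Put overwrites it."""

    def __init__(self):
        self.store = {}                     # key -> current data object
        self.progress = defaultdict(int)    # pid -> highest request served
        self.write_log = defaultdict(list)  # key -> [(vts, old object)], oldest first

    def put(self, pid, req, key, value):
        if req <= self.progress[pid]:
            return                          # duplicate from a slower replica
        self.progress[pid] = req
        if key in self.store:
            # Assumed VTS: snapshot of every process's request counter.
            self.write_log[key].append((dict(self.progress), self.store[key]))
        self.store[key] = value

    def get(self, pid, req, key):
        if req > self.progress[pid]:        # leading replica: read current value
            self.progress[pid] = req
            return self.store[key]
        # Slower replica: the oldest version overwritten after request `req`
        # of process `pid` was served is the one the leading replica saw.
        for vts, old in self.write_log[key]:
            if vts.get(pid, 0) >= req:
                return old
        return self.store[key]              # never overwritten since that read


def replay(space):
    """Two replicas of process 1 read x; process 0 overwrites x in between."""
    space.put(0, 1, "x", "old")     # process 0, request 1
    fast = space.get(1, 1, "x")     # fast replica of process 1, request 1
    space.put(0, 2, "x", "new")     # process 0, request 2 overwrites x
    slow = space.get(1, 1, "x")     # slow replica repeats request 1
    return fast, slow
```

Calling `replay(LogOnReadSpace())` and `replay(LogOnWriteSpace())` exercises the same scenario against both schemes. In this sketch the Log on Read space stores one copy per distinct Get, while the Log on Write space stores one copy per overwrite plus a counter vector, which is the kind of VTS overhead the abstract says the optimized scheme reduces.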