Recoverable Mutual Exclusion Under System-Wide Failures

W. Golab, Danny Hendler
{"title":"Recoverable Mutual Exclusion Under System-Wide Failures","authors":"W. Golab, Danny Hendler","doi":"10.1145/3212734.3212755","DOIUrl":null,"url":null,"abstract":"Recoverable mutual exclusion (RME) is a variation on the classic mutual exclusion (ME) problem that allows processes to crash and recover. The time complexity of RME algorithms is quantified in the same way as for ME, namely by counting remote memory references -- expensive memory operations that traverse the processor-to-memory interconnect. Prior work has established that the RMR complexity of the RME problem for n processes is Θ(log n) for the class of algorithms that use read/write registers and single-word comparison primitives such as Compare-And-Swap (Golab and Ramaraju 2016), O(log n / log log n) for the class of algorithms that use read/write registers and additional single-word read-modify-primitives such as Fetch-And-Store (Golab and Hendler 2017), and Θ(1) for the class of algorithms that use read/write registers and specialized double-word read-modify-write primitives (Golab and Hendler 2017). These complexity bounds hold in a model of computation where processes may fail independently, and where a process that fails while accessing the mutex is required to recover eventually. This body of work leaves open two important questions: (i) what is the tight bound on the RMR complexity of RME for the class of algorithms that use read/write registers and commonly supported single-word read-modify-primitives; and (ii) how is the RMR complexity of RME affected by variations in the failure model? This paper answers both questions partially by showing that RME can be solved using O(1) RMRs per passage in the worst case in a model where failures are system-wide (i.e., all processes crash simultaneously), and processes receive additional information from the environment regarding the occurrence of the failure. The upper bound algorithm we present relies crucially on a novel RMR-efficient barrier that processes use to synchronize recovery actions after each failure. The barrier uses read/write registers and single-word Compare-And-Swap only. Additionally, we present a transformation that can add properties such as critical section re-entry and a strong notion of starvation freedom to any RME algorithm while preserving its asymptotic RMR complexity.","PeriodicalId":198284,"journal":{"name":"Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3212734.3212755","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 20

Abstract

Recoverable mutual exclusion (RME) is a variation on the classic mutual exclusion (ME) problem that allows processes to crash and recover. The time complexity of RME algorithms is quantified in the same way as for ME, namely by counting remote memory references -- expensive memory operations that traverse the processor-to-memory interconnect. Prior work has established that the RMR complexity of the RME problem for n processes is Θ(log n) for the class of algorithms that use read/write registers and single-word comparison primitives such as Compare-And-Swap (Golab and Ramaraju 2016), O(log n / log log n) for the class of algorithms that use read/write registers and additional single-word read-modify-primitives such as Fetch-And-Store (Golab and Hendler 2017), and Θ(1) for the class of algorithms that use read/write registers and specialized double-word read-modify-write primitives (Golab and Hendler 2017). These complexity bounds hold in a model of computation where processes may fail independently, and where a process that fails while accessing the mutex is required to recover eventually. This body of work leaves open two important questions: (i) what is the tight bound on the RMR complexity of RME for the class of algorithms that use read/write registers and commonly supported single-word read-modify-primitives; and (ii) how is the RMR complexity of RME affected by variations in the failure model? This paper answers both questions partially by showing that RME can be solved using O(1) RMRs per passage in the worst case in a model where failures are system-wide (i.e., all processes crash simultaneously), and processes receive additional information from the environment regarding the occurrence of the failure. The upper bound algorithm we present relies crucially on a novel RMR-efficient barrier that processes use to synchronize recovery actions after each failure. The barrier uses read/write registers and single-word Compare-And-Swap only. Additionally, we present a transformation that can add properties such as critical section re-entry and a strong notion of starvation freedom to any RME algorithm while preserving its asymptotic RMR complexity.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
在系统范围故障下可恢复的互斥
可恢复互斥(RME)是经典互斥(ME)问题的一种变体,它允许进程崩溃并恢复。RME算法的时间复杂度以与ME相同的方式进行量化,即通过计算远程内存引用——遍历处理器到内存互连的昂贵内存操作。先前的工作已经确定,对于使用读写寄存器和单字比较原语(如比较和交换)的算法类,n个进程的RME问题的RMR复杂性为Θ(log n) (Golab和Ramaraju 2016),对于使用读写寄存器和其他单字读-修改原语(如Fetch-And-Store)的算法类,RMR复杂性为O(log n / log log n) (Golab和Hendler 2017)。和Θ(1)用于使用读/写寄存器和专用双字读-修改-写原语的算法类(Golab和Hendler 2017)。这些复杂性界限适用于这样的计算模型,即进程可能独立失败,并且访问互斥锁时失败的进程需要最终恢复。这项工作留下了两个重要的问题:(i)对于使用读/写寄存器和通常支持的单字读-修改原语的算法类,RME的RMR复杂性的严格界限是什么;(ii) RME的RMR复杂性如何受到失效模型变化的影响?本文部分地回答了这两个问题,在模型的最坏情况下,故障是系统范围的(即,所有进程同时崩溃),并且进程从环境接收有关故障发生的附加信息,通过展示每个通道使用O(1)个RME可以解决。我们提出的上限算法主要依赖于一种新的rmr高效屏障,该屏障用于在每次故障后同步恢复动作。屏障只使用读/写寄存器和单字比较与交换。此外,我们提出了一种转换,该转换可以为任何RME算法添加诸如临界区重入和强饥饿自由概念等属性,同时保持其渐近RMR复杂性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Tutorial: Consistency Choices in Modern Distributed Systems Locking Timestamps versus Locking Objects Recoverable Mutual Exclusion Under System-Wide Failures Nesting-Safe Recoverable Linearizability: Modular Constructions for Non-Volatile Memory Brief Announcement: Beeping a Time-Optimal Leader Election
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1