Two Elementary Instructions Make Compare-and-Swap

P. Khanchandani, Roger Wattenhofer
{"title":"Two Elementary Instructions Make Compare-and-Swap","authors":"P. Khanchandani, Roger Wattenhofer","doi":"10.1109/IPDPS.2019.00046","DOIUrl":null,"url":null,"abstract":"The consensus number of an object is the maximum number of processes among which binary consensus can be solved using any number of instances of the object and read-write registers. Herlihy [1] showed in his seminal work that if an object has a consensus number of n, then its instances can be used to implement any non-trivial object or data structure that is shared among n processes, so that the implementation is wait-free and linearizable. Thus, an object such as compare-and-set with an infinite consensus number is \"advanced\" because its instances can be used to implement any non-trivial concurrent object shared among any number of processes. On the other hand, objects such as fetch-and-add or fetch-and-multiply have a consensus number of two and are \"elementary\". An important consequence of Herlihy's result was that any number of reasonable elementary objects are provably insufficient to implement an advanced object like compare-and-set. However, Ellen et al. [2] observed recently that real multiprocessors do not compute using objects but using instructions that are applied on memory locations. Using this observation, they show that it is possible to use a couple of elementary instructions on the same memory location to implement an advanced one, and consequently any non-trivial object or data structure. However, the above result is only a possibility and uses a generic universal construction as a black-box, which is not how we implement objects in practice, as the generic construction is quite inefficient with respect to the number of steps taken by a process and the number of shared objects used in the worst case. Instead, the efficient implementations are built upon the widely supported compare-and-set instruction and one cannot conclude from the previous result whether the elementary instructions can also produce equally efficient implementations like compare-and-set does or they are fundamentally limited in this respect. In this paper, we answer this question by giving a wait-free and linearizable implementation of compare-and-set using just two elementary instructions, half-max and max-write. The implementation takes O(1) steps per process and uses O(1) shared objects per process. Thus, any known or unknown compare-and-set based implementation can also be done using only two elementary instructions without any loss in efficiency. An interesting aspect of these elementary instructions is that depending on the underlying system, their throughput in a highly concurrent setting is larger than that of the compare-and-set instructions by a factor proportional to n.","PeriodicalId":403406,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"125 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2019.00046","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

The consensus number of an object is the maximum number of processes among which binary consensus can be solved using any number of instances of the object and read-write registers. Herlihy [1] showed in his seminal work that if an object has a consensus number of n, then its instances can be used to implement any non-trivial object or data structure that is shared among n processes, so that the implementation is wait-free and linearizable. Thus, an object such as compare-and-set with an infinite consensus number is "advanced" because its instances can be used to implement any non-trivial concurrent object shared among any number of processes. On the other hand, objects such as fetch-and-add or fetch-and-multiply have a consensus number of two and are "elementary". An important consequence of Herlihy's result was that any number of reasonable elementary objects are provably insufficient to implement an advanced object like compare-and-set. However, Ellen et al. [2] observed recently that real multiprocessors do not compute using objects but using instructions that are applied on memory locations. Using this observation, they show that it is possible to use a couple of elementary instructions on the same memory location to implement an advanced one, and consequently any non-trivial object or data structure. However, the above result is only a possibility and uses a generic universal construction as a black-box, which is not how we implement objects in practice, as the generic construction is quite inefficient with respect to the number of steps taken by a process and the number of shared objects used in the worst case. Instead, the efficient implementations are built upon the widely supported compare-and-set instruction and one cannot conclude from the previous result whether the elementary instructions can also produce equally efficient implementations like compare-and-set does or they are fundamentally limited in this respect. In this paper, we answer this question by giving a wait-free and linearizable implementation of compare-and-set using just two elementary instructions, half-max and max-write. The implementation takes O(1) steps per process and uses O(1) shared objects per process. Thus, any known or unknown compare-and-set based implementation can also be done using only two elementary instructions without any loss in efficiency. An interesting aspect of these elementary instructions is that depending on the underlying system, their throughput in a highly concurrent setting is larger than that of the compare-and-set instructions by a factor proportional to n.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
两个基本指令进行比较与交换
对象的共识数是使用任意数量的对象实例和读写寄存器解决二进制共识的最大进程数。Herlihy[1]在他的开创性工作中表明,如果一个对象具有n个共识数,那么它的实例可以用于实现在n个进程之间共享的任何非平凡对象或数据结构,因此实现是无等待和线性化的。因此,具有无限共识数的对象(如compare-and-set)是“高级”的,因为它的实例可用于实现在任意数量的进程之间共享的任何非平凡并发对象。另一方面,诸如“获取并添加”或“获取并相乘”之类的对象的共识数为2,并且是“基本的”。Herlihy结果的一个重要结论是,任何数量的合理基本对象都不足以实现像比较与设置这样的高级对象。然而,Ellen等人最近观察到,真正的多处理器不使用对象计算,而是使用应用于内存位置的指令。通过这一观察,他们表明可以在相同的内存位置上使用一对基本指令来实现高级指令,从而实现任何重要的对象或数据结构。然而,上述结果只是一种可能性,并且使用了通用的通用结构作为黑盒,这不是我们在实践中实现对象的方式,因为在最坏的情况下,就进程所采取的步骤数量和所使用的共享对象数量而言,通用结构是相当低效的。相反,有效的实现是建立在广泛支持的比较和设置指令之上的,人们不能从前面的结果中得出结论,基本指令是否也能像比较和设置一样产生同样有效的实现,或者它们在这方面基本上是有限的。在本文中,我们给出了一个无等待且线性化的比较与设置的实现,只使用两个基本指令half-max和max-write来回答这个问题。每个进程执行O(1)个步骤,每个进程使用O(1)个共享对象。因此,任何已知或未知的基于比较和设置的实现也可以只用两个基本指令完成,而不会损失任何效率。这些基本指令的一个有趣的方面是,根据底层系统的不同,它们在高度并发设置下的吞吐量比比较和设置指令的吞吐量大一个与n成正比的系数。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Distributed Weighted All Pairs Shortest Paths Through Pipelining SAFIRE: Scalable and Accurate Fault Injection for Parallel Multithreaded Applications Architecting Racetrack Memory Preshift through Pattern-Based Prediction Mechanisms Z-Dedup:A Case for Deduplicating Compressed Contents in Cloud Dual Pattern Compression Using Data-Preprocessing for Large-Scale GPU Architectures
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1