Proceedings of the 2nd Workshop on the Principles and Practice of Consistency for Distributed Data
P. Alvaro, A. Bessani
doi:10.1145/2911151

Consistency is one of the fundamental issues of distributed computing. There are many competing consistency models, with subtly different power in principle. In practice, the well-known Consistency-Availability-Partition tolerance (CAP) trade-off translates into difficult choices between fault tolerance, performance, and programmability. The issues and trade-offs are particularly vexing at scale, with a large number of processes or a large shared database, and in the presence of high latency and failure-prone networks. It is clear that there is no one universally best solution. Possible approaches cover the whole spectrum between strong and eventual consistency. Strong consistency (linearizability or serializability, achieved via total ordering) provides familiar and intuitive semantics, but incurs slow and fragile synchronization and coordination overheads. The unlimited parallelism allowed by weaker models such as eventual consistency promises high performance, but divergence and conflicts make it difficult to ensure useful application invariants, and metadata is hard to keep in check. The research and development communities are actively exploring intermediate models (replicated data types, monotonic programming, CRDTs, LVars, causal consistency, red-blue consistency, invariant- and proof-based systems, etc.), designed to improve efficiency, programmability, and overall operation without negatively impacting scalability.

This workshop aims to investigate the principles and practice of consistency models for large-scale, distributed shared-data systems. It brings together theoreticians and practitioners from different backgrounds: system development, distributed algorithms, concurrency, fault tolerance, databases, and languages and verification, from both academia and industry.
{"title":"Proceedings of the 2nd Workshop on the Principles and Practice of Consistency for Distributed Data","authors":"P. Alvaro, A. Bessani","doi":"10.1145/2911151","DOIUrl":"https://doi.org/10.1145/2911151","url":null,"abstract":"Consistency is one of the fundamental issues of distributed computing. There are many competing consistency models, with subtly different power in principle. In practice, the well known the Consistency-Availability-Partition Tolerance trade-off translates to difficult choices between fault tolerance, performance, and programmability. The issues and trade-offs are particularly vexing at scale, with a large number of processes or a large shared database, and in the presence of high latency and failure-prone networks. It is clear that there is no one universally best solution. Possible approaches cover the whole spectrum between strong and eventual consistency. Strong consistency (linearizability or serializability, achieved via total ordering) provides familiar and intuitive semantics but requires slow and fragile synchronization and coordination overheads. The unlimited parallelism allowed by weaker models such as eventual consistency promises high performance, but divergence and conflicts make it difficult to ensure useful application invariants, and metadata is hard to keep in check. The research and development communities are actively exploring intermediate models (replicated data types, monotonic programming, CRDTs, LVars, causal consistency, red-blue consistency, invariant- and proof-based systems, etc.), designed to improve efficiency, programmability, and overall operation without negatively impacting scalability. \u0000 \u0000This workshop aims to investigate the principles and practice of consistency models for large-scale, distributed shared data systems. It will bring together theoreticians and practitioners from different horizons: system development, distributed algorithms, concurrency, fault tolerance, databases, language and verification, including both academia and industry.","PeriodicalId":259835,"journal":{"name":"Proceedings of the 2nd Workshop on the Principles and Practice of Consistency for Distributed Data","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127061839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

The problem with embedded CRDT counters and a solution
Carlos Baquero, Paulo Sérgio Almeida, Carl Lerche
doi:10.1145/2911151.2911159

Conflict-free Replicated Data Types (CRDTs) can simplify the design of deterministic eventual consistency. Counters were among the first CRDTs to be deployed in production systems. Counters are apparently simple, with a straightforward inc/dec/read API, but they can require complex implementations, and several variants have been specified and coded. Unlike sets and registers, which can be adapted to operate inside maps, current counter approaches exhibit anomalies when embedded in maps. Here, we illustrate the anomaly and propose a solution based on a new counter model and implementation.
{"title":"The problem with embedded CRDT counters and a solution","authors":"Carlos Baquero, Paulo Sérgio Almeida, Carl Lerche","doi":"10.1145/2911151.2911159","DOIUrl":"https://doi.org/10.1145/2911151.2911159","url":null,"abstract":"Conflict-free Replicated Data Types (CRDTs) can simplify the design of deterministic eventual consistency. Considering the several CRDTs that have been deployed in production systems, counters are among the first. Counters are apparently simple, with a straightforward inc/dec/read API, but can require complex implementations and several variants have been specified and coded. Unlike sets and registers, that can be adapted to operate inside maps, current counter approaches exhibit anomalies when embedded in maps. Here, we illustrate the anomaly and propose a solution, based on a new counter model and implementation.","PeriodicalId":259835,"journal":{"name":"Proceedings of the 2nd Workshop on the Principles and Practice of Consistency for Distributed Data","volume":"153 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115419096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Big(ger) sets: decomposed delta CRDT sets in Riak
R. Brown, Zeeshan Ali Lakhani, P. Place
doi:10.1145/2911151.2911156

CRDT [24] sets as implemented in Riak [6] perform poorly for writes, both as cardinality grows and for sets larger than 500 KB [25]. Riak users wish to create high-cardinality CRDT sets and expect better than O(n) performance for individual insert and remove operations. By decomposing a CRDT set on disk and employing delta replication [2], we can achieve far better performance than delta replication alone: write cost becomes proportional to the size of the causal metadata touched, not to the cardinality of the set, and we can support sets hundreds of times the size of current Riak sets while still providing the same level of consistency. There is a trade-off in read performance, but we expect it to be mitigated by enabling queries on sets.
{"title":"Big(ger) sets: decomposed delta CRDT sets in Riak","authors":"R. Brown, Zeeshan Ali Lakhani, P. Place","doi":"10.1145/2911151.2911156","DOIUrl":"https://doi.org/10.1145/2911151.2911156","url":null,"abstract":"CRDT[24] Sets as implemented in Riak[6] perform poorly for writes, both as cardinality grows, and for sets larger than 500KB[25]. Riak users wish to create high cardinality CRDT sets, and expect better than O(n) performance for individual insert and remove operations. By decomposing a CRDT set on disk, and employing delta-replication[2], we can achieve far better performance than just delta replication alone: relative to the size of causal metadata, not the cardinality of the set, and we can support sets that are 100s times the size of Riak sets, while still providing the same level of consistency. There is a trade-off in read performance but we expect it is mitigated by enabling queries on sets.","PeriodicalId":259835,"journal":{"name":"Proceedings of the 2nd Workshop on the Principles and Practice of Consistency for Distributed Data","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131714810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Eventually consistent register revisited
M. Zawirski, Carlos Baquero, Annette Bieniusa, Nuno M. Preguiça, M. Shapiro
doi:10.1145/2911151.2911157

In order to converge in the presence of concurrent updates, modern eventually consistent replication systems rely on causality information and operation semantics. It is relatively easy to use the semantics of high-level operations on replicated data structures, such as sets and lists. However, it is difficult to exploit the semantics of operations on registers, which store opaque data. In existing register designs, concurrent writes are resolved either by the application or by arbitrating between them according to their timestamps. The former is complex and may require user intervention, whereas the latter causes arbitrary updates to be lost. In this work, we identify a register construction that generalizes existing ones by combining runtime causality ordering, to identify concurrent writes, with static data semantics, to resolve them. We propose a simple conflict-resolution template based on an application-predefined order on the domain of values. It eliminates or reduces the number of conflicts that need to be resolved by the user or by explicit application logic. We illustrate several variants of our approach with use cases and show how it generalizes existing designs.
{"title":"Eventually consistent register revisited","authors":"M. Zawirski, Carlos Baquero, Annette Bieniusa, Nuno M. Preguiça, M. Shapiro","doi":"10.1145/2911151.2911157","DOIUrl":"https://doi.org/10.1145/2911151.2911157","url":null,"abstract":"In order to converge in the presence of concurrent updates, modern eventually consistent replication systems rely on causality information and operation semantics. It is relatively easy to use semantics of high-level operations on replicated data structures, such as sets, lists, etc. However, it is difficult to exploit semantics of operations on registers, which store opaque data. In existing register designs, concurrent writes are resolved either by the application, or by arbitrating them according to their timestamps. The former is complex and may require user intervention, whereas the latter causes arbitrary updates to be lost. In this work, we identify a register construction that generalizes existing ones by combining runtime causality ordering, to identify concurrent writes, with static data semantics, to resolve them. We propose a simple conflict resolution template based on an application-predefined order on the domain of values. It eliminates or reduces the number of conflicts that need to be resolved by the user or by an explicit application logic. We illustrate some variants of our approach with use cases, and how it generalizes existing designs.","PeriodicalId":259835,"journal":{"name":"Proceedings of the 2nd Workshop on the Principles and Practice of Consistency for Distributed Data","volume":"30 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127835499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}