Ricardo Gonçalves, Paulo Sérgio Almeida, Carlos Baquero, V. Fonte
{"title":"DottedDB:反熵没有默克尔树,删除没有墓碑","authors":"Ricardo Gonçalves, Paulo Sérgio Almeida, Carlos Baquero, V. Fonte","doi":"10.1109/SRDS.2017.28","DOIUrl":null,"url":null,"abstract":"To achieve high availability in the face of network partitions, many distributed databases adopt eventual consistency, allow temporary conflicts due to concurrent writes, and use some form of per-key logical clock to detect and resolve such conflicts. Furthermore, nodes synchronize periodically to ensure replica convergence in a process called anti-entropy, normally using Merkle Trees. We present the design of DottedDB, a Dynamo-like key-value store, which uses a novel node-wide logical clock framework, overcoming three fundamental limitations of the state of the art: (1) minimize the metadata per key necessary to track causality, avoiding its growth even in the face of node churn; (2) correctly and durably delete keys, with no need for tombstones; (3) offer a lightweight anti-entropy mechanism to converge replicated data, avoiding the need for Merkle Trees. We evaluate DottedDB against MerkleDB, an otherwise identical database, but using per-key logical clocks and Merkle Trees for anti-entropy, to precisely measure the impact of the novel approach. Results show that: causality metadata per object always converges rapidly to only one id-counter pair; distributed deletes are correctly achieved without global coordination and with constant metadata; divergent nodes are synchronized faster, with less memory-footprint and with less communication overhead than using Merkle Trees.","PeriodicalId":6475,"journal":{"name":"2017 IEEE 36th Symposium on Reliable Distributed Systems (SRDS)","volume":"35 1","pages":"194-203"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"DottedDB: Anti-Entropy without Merkle Trees, Deletes without Tombstones\",\"authors\":\"Ricardo Gonçalves, Paulo Sérgio Almeida, Carlos Baquero, V. Fonte\",\"doi\":\"10.1109/SRDS.2017.28\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To achieve high availability in the face of network partitions, many distributed databases adopt eventual consistency, allow temporary conflicts due to concurrent writes, and use some form of per-key logical clock to detect and resolve such conflicts. Furthermore, nodes synchronize periodically to ensure replica convergence in a process called anti-entropy, normally using Merkle Trees. We present the design of DottedDB, a Dynamo-like key-value store, which uses a novel node-wide logical clock framework, overcoming three fundamental limitations of the state of the art: (1) minimize the metadata per key necessary to track causality, avoiding its growth even in the face of node churn; (2) correctly and durably delete keys, with no need for tombstones; (3) offer a lightweight anti-entropy mechanism to converge replicated data, avoiding the need for Merkle Trees. We evaluate DottedDB against MerkleDB, an otherwise identical database, but using per-key logical clocks and Merkle Trees for anti-entropy, to precisely measure the impact of the novel approach. Results show that: causality metadata per object always converges rapidly to only one id-counter pair; distributed deletes are correctly achieved without global coordination and with constant metadata; divergent nodes are synchronized faster, with less memory-footprint and with less communication overhead than using Merkle Trees.\",\"PeriodicalId\":6475,\"journal\":{\"name\":\"2017 IEEE 36th Symposium on Reliable Distributed Systems (SRDS)\",\"volume\":\"35 1\",\"pages\":\"194-203\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE 36th Symposium on Reliable Distributed Systems (SRDS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SRDS.2017.28\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE 36th Symposium on Reliable Distributed Systems (SRDS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SRDS.2017.28","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
DottedDB: Anti-Entropy without Merkle Trees, Deletes without Tombstones
To achieve high availability in the face of network partitions, many distributed databases adopt eventual consistency, allow temporary conflicts due to concurrent writes, and use some form of per-key logical clock to detect and resolve such conflicts. Furthermore, nodes synchronize periodically to ensure replica convergence in a process called anti-entropy, normally using Merkle Trees. We present the design of DottedDB, a Dynamo-like key-value store, which uses a novel node-wide logical clock framework, overcoming three fundamental limitations of the state of the art: (1) minimize the metadata per key necessary to track causality, avoiding its growth even in the face of node churn; (2) correctly and durably delete keys, with no need for tombstones; (3) offer a lightweight anti-entropy mechanism to converge replicated data, avoiding the need for Merkle Trees. We evaluate DottedDB against MerkleDB, an otherwise identical database, but using per-key logical clocks and Merkle Trees for anti-entropy, to precisely measure the impact of the novel approach. Results show that: causality metadata per object always converges rapidly to only one id-counter pair; distributed deletes are correctly achieved without global coordination and with constant metadata; divergent nodes are synchronized faster, with less memory-footprint and with less communication overhead than using Merkle Trees.