Optimal Bloom Filters and Adaptive Merging for LSM-Trees

ACM Transactions on Database Systems (TODS). Pub Date: 2018-12-08. DOI: 10.1145/3276980

Niv Dayan, Manos Athanassoulis, Stratos Idreos


Citations: 49

Abstract

In this article, we show that key-value stores backed by a log-structured merge-tree (LSM-tree) exhibit an intrinsic tradeoff between lookup cost, update cost, and main memory footprint, yet all existing designs expose a suboptimal and difficult-to-tune tradeoff among these metrics. We trace the problem to the fact that modern key-value stores suboptimally co-tune the merge policy, the buffer size, and the Bloom filters’ false-positive rates across the LSM-tree’s different levels. We present Monkey, an LSM-tree-based key-value store that strikes the optimal balance between the costs of updates and lookups with any given main memory budget. The core insight is that worst-case lookup cost is proportional to the sum of the false-positive rates of the Bloom filters across all levels of the LSM-tree. In contrast to state-of-the-art key-value stores, which assign a fixed number of bits per element to all Bloom filters, Monkey allocates memory to filters across different levels so as to minimize the sum of their false-positive rates. We show analytically that Monkey reduces the asymptotic complexity of the worst-case lookup I/O cost, and we verify empirically, using an implementation on top of RocksDB, that Monkey reduces lookup latency by an increasing margin as the data volume grows (50-80% for the data sizes we experimented with). Furthermore, we map the design space onto a closed-form model that enables adapting the merging frequency and memory allocation to strike the best tradeoff among lookup cost, update cost, and main memory, depending on the workload (proportion of lookups and updates), the dataset (number and size of entries), and the underlying hardware (main memory available, disk vs. flash). We show how to use this model to answer what-if design questions about how changes in environmental parameters impact performance and how to adapt the design of the key-value store for optimal performance.
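The allocation idea in the abstract can be illustrated with a small numerical sketch. It is not the paper's published algorithm or its RocksDB implementation. It assumes the standard Bloom filter approximation (false-positive rate = exp(-bits_per_entry * ln(2)^2)), one filter per level, and levels whose entry counts grow by a fixed size ratio. All function names, the level sizes, and the 10-bits-per-entry budget are hypothetical choices made for illustration. Under those assumptions, minimizing the sum of false-positive rates for a fixed total number of filter bits yields rates proportional to each level's entry count, which the sketch finds by bisection and compares against a uniform bits-per-entry allocation.

```python
import math

LN2_SQ = math.log(2) ** 2  # constant in the standard Bloom filter FPR approximation


def level_sizes(num_levels, size_ratio, first_level_entries):
    """Entry count per level, growing geometrically (illustrative assumption)."""
    return [first_level_entries * size_ratio ** i for i in range(num_levels)]


def memory_bits(sizes, fprs):
    """Total filter bits needed so that level i reaches false-positive rate fprs[i]."""
    return sum(-n * math.log(p) / LN2_SQ for n, p in zip(sizes, fprs))


def uniform_fprs(sizes, budget_bits):
    """Baseline: the same number of bits per entry at every level."""
    bits_per_entry = budget_bits / sum(sizes)
    return [math.exp(-bits_per_entry * LN2_SQ) for _ in sizes]


def balanced_fprs(sizes, budget_bits, iterations=200):
    """Rates p_i = min(1, c * n_i), with c chosen by bisection to fit the budget.

    Minimizing sum(p_i) subject to memory_bits(...) <= budget_bits gives
    p_i proportional to n_i (Lagrange condition), capped at 1.
    """
    def rates(log_c):
        return [math.exp(min(0.0, log_c + math.log(n))) for n in sizes]

    lo = -700.0                  # tiny c: very low rates, very high memory
    hi = -math.log(min(sizes))   # large c: every rate is 1, zero memory
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if memory_bits(sizes, rates(mid)) > budget_bits:
            lo = mid             # over budget: rates must increase
        else:
            hi = mid             # within budget: try lower rates
    return rates(hi)             # hi always satisfies the budget


if __name__ == "__main__":
    sizes = level_sizes(num_levels=5, size_ratio=10, first_level_entries=1000)
    budget = 10 * sum(sizes)     # hypothetical budget: 10 bits per entry overall
    print("uniform  sum of FPRs:", sum(uniform_fprs(sizes, budget)))
    print("balanced sum of FPRs:", sum(balanced_fprs(sizes, budget)))
```

With the same memory budget, the balanced allocation gives smaller levels much lower false-positive rates and lets the largest level carry most of the error, so the sum of rates (and hence the modeled worst-case lookup cost) is lower than with the uniform allocation.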

