LSbM-tree: Re-Enabling Buffer Caching in Data Management for Mixed Reads and Writes

2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS) | Pub Date: 2017-06-05 | DOI: 10.1109/ICDCS.2017.70

Dejun Teng, Lei Guo, Rubao Lee, Feng Chen, Siyuan Ma, Yanfeng Zhang, Xiaodong Zhang


Citations: 34

Abstract

The LSM-tree is widely used in production data management systems for write-intensive workloads. However, when reads and writes coexist on an LSM-tree, data accesses can suffer long latency and low throughput because compaction, a major and frequent LSM-tree operation, interferes with buffer caching. A compaction reorganizes existing data blocks and writes them to other locations on disk. As a result, the corresponding blocks already loaded in the buffer cache are invalidated because their referencing addresses have changed, which causes serious performance degradation. To re-enable high-speed buffer caching during intensive writes, we propose the Log-Structured buffered-Merge tree (LSbM-tree for short), which adds a compaction buffer on disk to minimize the buffer cache invalidations caused by compactions. The compaction buffer efficiently and adaptively maintains the frequently visited data sets. In LSbM, objects with strong locality can be kept in the buffer cache with few or no harmful invalidations. With the help of a small on-disk compaction buffer, LSbM achieves high query performance by enabling effective buffer caching, while retaining all the merits of the LSM-tree for write-intensive data processing and providing high disk bandwidth for range queries. We have implemented LSbM based on LevelDB. We show that with a standard buffer cache and a hard disk, LSbM can achieve a 2x performance improvement over LevelDB. We have also compared LSbM with other existing solutions to show its effectiveness.
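
The invalidation problem described in the abstract can be illustrated with a small toy model. The sketch below is not LevelDB or LSbM code: `BufferCache`, `ToyStore`, and `run_demo` are hypothetical names, each key occupies one block, and compaction is simplified to rewriting all live data into a single new file. It only shows why a cache keyed by physical block address loses its hot blocks once compaction moves the data.

```python
from collections import OrderedDict


class BufferCache:
    """Tiny LRU cache keyed by physical block address (file_id, block_no)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, address, load_from_disk):
        if address in self.blocks:
            self.blocks.move_to_end(address)  # mark as most recently used
            self.hits += 1
            return self.blocks[address]
        self.misses += 1
        data = load_from_disk(address)
        self.blocks[address] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return data


class ToyStore:
    """Hypothetical store: one key per block, files identified by integer ids."""

    def __init__(self, cache):
        self.cache = cache
        self.next_file_id = 0
        self.index = {}  # key -> (file_id, block_no)
        self.disk = {}   # (file_id, block_no) -> value

    def write_file(self, items):
        file_id = self.next_file_id
        self.next_file_id += 1
        for block_no, (key, value) in enumerate(sorted(items.items())):
            address = (file_id, block_no)
            self.disk[address] = value
            self.index[key] = address
        return file_id

    def get(self, key):
        return self.cache.read(self.index[key], self.disk.__getitem__)

    def compact(self):
        # Simplified compaction: rewrite all live data into a new file,
        # so every block ends up at a new physical address.
        live = {key: self.disk[addr] for key, addr in self.index.items()}
        self.disk.clear()
        self.write_file(live)


def run_demo():
    cache = BufferCache(capacity=8)
    store = ToyStore(cache)
    store.write_file({f"k{i}": f"v{i}" for i in range(4)})
    hot_keys = ["k0", "k1"]
    for key in hot_keys:   # cold reads: misses
        store.get(key)
    for key in hot_keys:   # warm reads: hits
        store.get(key)
    before = (cache.hits, cache.misses)
    store.compact()
    for key in hot_keys:   # same keys, new addresses: misses again
        store.get(key)
    after = (cache.hits, cache.misses)
    return before, after


if __name__ == "__main__":
    print(run_demo())
```

In this model the hot keys are served from the cache until `compact()` runs, after which the same keys miss again because their blocks now live at different addresses. This is the effect that, according to the abstract, LSbM's on-disk compaction buffer is designed to minimize.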


