{"title":"Kill Two Birds with One Stone: Auto-tuning RocksDB for High Bandwidth and Low Latency","authors":"Yichen Jia, Feng Chen","doi":"10.1109/ICDCS47774.2020.00113","DOIUrl":null,"url":null,"abstract":"Log-Structured Merge (LSM) tree based key-value stores are widely deployed in data centers. Due to its complex internal structures, appropriately configuring a modern key-value data store system, which can have more than 50 parameters with various hardware and system settings, is a highly challenging task. Currently, the industry still heavily relies on a traditional, experience-based, hand-tuning approach for performance tuning. Many simply adopt the default setting out of the box with no changes. Auto-tuning, as a self-adaptive solution, is thus highly appealing for achieving optimal or near-optimal performance in real-world deployment.In this paper, we quantitatively study and compare five optimization methods for auto-tuning the performance of LSM-tree based key-value stores. In order to evaluate the auto-tuning processes, we have conducted an exhaustive set of experiments over RocksDB, a representative LSM-tree data store. We have collected over 12,000 experimental records in 6 months, with about 2,000 software configurations of 6 parameters on different hardware setups. We have compared five representative algorithms, in terms of throughput, the 99th percentile tail latency, convergence time, real-time system throughput, and the iteration process, etc. We find that multi-objective optimization (MOO) methods can achieve a good balance among multiple targets, which satisfies the unique needs of key-value services. The more specific Quality of Service (QoS) requirements users can provide, the better performance these algorithms can achieve. We also find that the number of concurrent threads and the write buffer size are the two most impactful parameters determining the throughput and the 99th percentile tail latency across different hardware and workloads. 
Finally, we provide system-level explanations for the auto-tuning results and also discuss the associated implications for system designers and practitioners. We hope this work will pave the way towards a practical, high-speed auto-tuning solution for key-value data store systems.","PeriodicalId":158630,"journal":{"name":"2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS)","volume":"24 2","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDCS47774.2020.00113","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6
Abstract
Log-Structured Merge (LSM) tree based key-value stores are widely deployed in data centers. Due to their complex internal structures, appropriately configuring a modern key-value data store system, which can have more than 50 parameters across various hardware and system settings, is a highly challenging task. Currently, the industry still heavily relies on a traditional, experience-based, hand-tuning approach for performance tuning. Many simply adopt the default settings out of the box with no changes. Auto-tuning, as a self-adaptive solution, is thus highly appealing for achieving optimal or near-optimal performance in real-world deployments. In this paper, we quantitatively study and compare five optimization methods for auto-tuning the performance of LSM-tree based key-value stores. In order to evaluate the auto-tuning processes, we have conducted an exhaustive set of experiments on RocksDB, a representative LSM-tree data store. We have collected over 12,000 experimental records in 6 months, covering about 2,000 software configurations of 6 parameters on different hardware setups. We have compared five representative algorithms in terms of throughput, 99th-percentile tail latency, convergence time, real-time system throughput, and the iteration process. We find that multi-objective optimization (MOO) methods can achieve a good balance among multiple targets, which satisfies the unique needs of key-value services. The more specific the Quality of Service (QoS) requirements users can provide, the better the performance these algorithms can achieve. We also find that the number of concurrent threads and the write buffer size are the two most impactful parameters determining throughput and 99th-percentile tail latency across different hardware and workloads. Finally, we provide system-level explanations for the auto-tuning results and discuss the associated implications for system designers and practitioners. We hope this work will pave the way towards a practical, high-speed auto-tuning solution for key-value data store systems.
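The multi-objective optimization idea described in the abstract — balancing throughput against 99th-percentile tail latency rather than optimizing either alone — can be illustrated with a minimal Pareto-front selection sketch. This is not the paper's implementation; the configurations and performance numbers below are hypothetical, and the two tuned parameters (write buffer size in MB, number of concurrent threads) are chosen only because the abstract identifies them as the most impactful.

```python
# Minimal sketch of Pareto-front selection for two objectives:
# maximize throughput (ops/s), minimize 99th-percentile latency (ms).
# All configurations and measurements below are hypothetical examples.

def dominates(a, b):
    """True if point a = (throughput, p99) is no worse than b on both
    objectives and strictly better on at least one."""
    thr_a, lat_a = a
    thr_b, lat_b = b
    no_worse = thr_a >= thr_b and lat_a <= lat_b
    strictly_better = thr_a > thr_b or lat_a < lat_b
    return no_worse and strictly_better

def pareto_front(measurements):
    """Keep only configurations not dominated by any other configuration."""
    return {
        cfg: perf
        for cfg, perf in measurements.items()
        if not any(dominates(other, perf)
                   for other_cfg, other in measurements.items()
                   if other_cfg != cfg)
    }

# Hypothetical results: (write_buffer_size_mb, threads) -> (ops/s, p99 ms)
measurements = {
    (64, 8):   (120_000, 4.1),   # lowest latency
    (128, 16): (150_000, 5.0),   # balanced
    (256, 16): (140_000, 6.5),   # dominated by (128, 16)
    (64, 32):  (155_000, 7.2),   # highest throughput
}

front = pareto_front(measurements)
```

Any point on the resulting front is a defensible choice; a user's QoS requirement (e.g., a p99 bound of 5 ms) then picks a single configuration from it, which mirrors the abstract's observation that more specific QoS requirements let the algorithms do better.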