Kangaroo: Theory and Practice of Caching Billions of Tiny Objects on Flash

ACM Transactions on Storage (TOS) Pub Date : 2022-06-13 DOI:10.1145/3542928

Sara McAllister, Benjamin Berg, Julian Tutuncu-Macias, Juncheng Yang, S. Gunasekar, Jimmy Lu, Daniel S. Berger, Nathan Beckmann, G. Ganger

{"title":"Kangaroo: Theory and Practice of Caching Billions of Tiny Objects on Flash","authors":"Sara McAllister, Benjamin Berg, Julian Tutuncu-Macias, Juncheng Yang, S. Gunasekar, Jimmy Lu, Daniel S. Berger, Nathan Beckmann, G. Ganger","doi":"10.1145/3542928","DOIUrl":null,"url":null,"abstract":"Many social-media and IoT services have very large working sets consisting of billions of tiny (≈100 B) objects. Large, flash-based caches are important to serving these working sets at acceptable monetary cost. However, caching tiny objects on flash is challenging for two reasons: (i) SSDs can read/write data only in multi-KB “pages” that are much larger than a single object, stressing the limited number of times flash can be written; and (ii) very few bits per cached object can be kept in DRAM without losing flash’s cost advantage. Unfortunately, existing flash-cache designs fall short of addressing these challenges: write-optimized designs require too much DRAM, and DRAM-optimized designs require too many flash writes. We present Kangaroo, a new flash-cache design that optimizes both DRAM usage and flash writes to maximize cache performance while minimizing cost. Kangaroo combines a large, set-associative cache with a small, log-structured cache. The set-associative cache requires minimal DRAM, while the log-structured cache minimizes Kangaroo’s flash writes. Experiments using traces from Meta and Twitter show that Kangaroo achieves DRAM usage close to the best prior DRAM-optimized design, flash writes close to the best prior write-optimized design, and miss ratios better than both. Kangaroo’s design is Pareto-optimal across a range of allowed write rates, DRAM sizes, and flash sizes, reducing misses by 29% over the state of the art. These results are corroborated by analytical models presented herein and with a test deployment of Kangaroo in a production flash cache at Meta.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Storage (TOS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3542928","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 4

Abstract

Many social-media and IoT services have very large working sets consisting of billions of tiny (≈100 B) objects. Large, flash-based caches are important to serving these working sets at acceptable monetary cost. However, caching tiny objects on flash is challenging for two reasons: (i) SSDs can read/write data only in multi-KB “pages” that are much larger than a single object, stressing the limited number of times flash can be written; and (ii) very few bits per cached object can be kept in DRAM without losing flash’s cost advantage. Unfortunately, existing flash-cache designs fall short of addressing these challenges: write-optimized designs require too much DRAM, and DRAM-optimized designs require too many flash writes. We present Kangaroo, a new flash-cache design that optimizes both DRAM usage and flash writes to maximize cache performance while minimizing cost. Kangaroo combines a large, set-associative cache with a small, log-structured cache. The set-associative cache requires minimal DRAM, while the log-structured cache minimizes Kangaroo’s flash writes. Experiments using traces from Meta and Twitter show that Kangaroo achieves DRAM usage close to the best prior DRAM-optimized design, flash writes close to the best prior write-optimized design, and miss ratios better than both. Kangaroo’s design is Pareto-optimal across a range of allowed write rates, DRAM sizes, and flash sizes, reducing misses by 29% over the state of the art. These results are corroborated by analytical models presented herein and with a test deployment of Kangaroo in a production flash cache at Meta.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

袋鼠:在Flash上缓存数十亿微小对象的理论和实践

许多社交媒体和物联网服务都有非常大的工作集，由数十亿个微小(≈100 B)对象组成。大型的基于闪存的缓存对于以可接受的货币成本为这些工作集提供服务非常重要。然而，在闪存上缓存微小对象是具有挑战性的，原因有两个:(i) ssd只能在多kb的“页”中读取/写入数据，这比单个对象大得多，强调闪存可以写入的次数有限;(ii)在不失去闪存的成本优势的情况下，每个缓存对象可以保存在DRAM中的比特很少。不幸的是，现有的闪存缓存设计无法解决这些挑战:写入优化设计需要太多的DRAM，而DRAM优化设计需要太多的闪存写入。我们提出袋鼠，一种新的闪存缓存设计，优化了DRAM的使用和闪存写入，以最大限度地提高缓存性能，同时最大限度地降低成本。Kangaroo结合了一个大的、集关联的缓存和一个小的、日志结构的缓存。集合关联缓存需要最少的DRAM，而日志结构缓存则最小化了Kangaroo的闪存写入。使用Meta和Twitter跟踪的实验表明，Kangaroo实现了接近最佳先前DRAM优化设计的DRAM使用率，接近最佳先前写入优化设计的闪存写入，并且缺失率优于两者。袋鼠的设计在允许的写入速率、DRAM大小和闪存大小范围内都是帕累托最优的，比目前的技术水平减少了29%的失误。这些结果得到了本文提出的分析模型的证实，并在Meta的生产闪存缓存中测试部署了Kangaroo。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

ACM Transactions on Storage (TOS)

自引率

0.00%

发文量

期刊最新文献

WebAssembly-based Delta Sync for Cloud Storage Services DEFUSE: An Interface for Fast and Correct User Space File System Access Donag: Generating Efficient Patches and Diffs for Compressed Archives Building GC-free Key-value Store on HM-SMR Drives with ZoneFS Kangaroo: Theory and Practice of Caching Billions of Tiny Objects on Flash