Dong Huang, D. Feng, Qian-Qian Liu, Bo Ding, Wei Zhao, Xueliang Wei, Wei Tong
{"title":"SplitZNS: Towards an Efficient LSM-Tree on Zoned Namespace SSDs","authors":"Dong Huang, D. Feng, Qian-Qian Liu, Bo Ding, Wei Zhao, Xueliang Wei, Wei Tong","doi":"10.1145/3608476","DOIUrl":null,"url":null,"abstract":"The Zoned Namespace (ZNS) Solid State Drive (SSD) is a nascent form of storage device that offers novel prospects for the Log Structured Merge Tree (LSM-tree). ZNS exposes erase blocks in SSD as append-only zones, enabling the LSM-tree to gain awareness of the physical layout of data. Nevertheless, LSM-tree on ZNS SSDs necessitates Garbage Collection (GC) owing to the mismatch between the gigantic zones and relatively small Sorted String Tables (SSTables). Through extensive experiments, we observe that a smaller zone size can reduce data migration in GC at the cost of a significant performance decline owing to inadequate parallelism exploitation. In this article, we present SplitZNS, which introduces small zones by tweaking the zone-to-chip mapping to maximize GC efficiency for LSM-tree on ZNS SSDs. Following the multi-level peculiarity of LSM-tree and the inherent parallel architecture of ZNS SSDs, we propose a number of techniques to leverage and accelerate small zones to alleviate the performance impact due to underutilized parallelism. (1) First, we use small zones selectively to prevent exacerbating write slowdowns and stalls due to their suboptimal performance. (2) Second, to enhance parallelism utilization, we propose SubZone Ring, which employs a per-chip FIFO buffer to imitate a large zone writing style; (3) Read Prefetcher, which prefetches data concurrently through multiple chips during compactions; (4) and Read Scheduler, which assigns query requests the highest priority. We build a prototype integrated with SplitZNS to validate its efficiency and efficacy. Experimental results demonstrate that SplitZNS achieves up to 2.77× performance and reduces data migration considerably compared to the lifetime-based data placement.1","PeriodicalId":50920,"journal":{"name":"ACM Transactions on Architecture and Code Optimization","volume":"140 1","pages":"1 - 26"},"PeriodicalIF":1.5000,"publicationDate":"2023-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Architecture and Code Optimization","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3608476","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0
Abstract
The Zoned Namespace (ZNS) Solid State Drive (SSD) is a nascent form of storage device that offers novel prospects for the Log Structured Merge Tree (LSM-tree). ZNS exposes erase blocks in SSD as append-only zones, enabling the LSM-tree to gain awareness of the physical layout of data. Nevertheless, LSM-tree on ZNS SSDs necessitates Garbage Collection (GC) owing to the mismatch between the gigantic zones and relatively small Sorted String Tables (SSTables). Through extensive experiments, we observe that a smaller zone size can reduce data migration in GC at the cost of a significant performance decline owing to inadequate parallelism exploitation. In this article, we present SplitZNS, which introduces small zones by tweaking the zone-to-chip mapping to maximize GC efficiency for LSM-tree on ZNS SSDs. Following the multi-level peculiarity of LSM-tree and the inherent parallel architecture of ZNS SSDs, we propose a number of techniques to leverage and accelerate small zones to alleviate the performance impact due to underutilized parallelism. (1) First, we use small zones selectively to prevent exacerbating write slowdowns and stalls due to their suboptimal performance. (2) Second, to enhance parallelism utilization, we propose SubZone Ring, which employs a per-chip FIFO buffer to imitate a large zone writing style; (3) Read Prefetcher, which prefetches data concurrently through multiple chips during compactions; (4) and Read Scheduler, which assigns query requests the highest priority. We build a prototype integrated with SplitZNS to validate its efficiency and efficacy. Experimental results demonstrate that SplitZNS achieves up to 2.77× performance and reduces data migration considerably compared to the lifetime-based data placement.1
期刊介绍:
ACM Transactions on Architecture and Code Optimization (TACO) focuses on hardware, software, and system research spanning the fields of computer architecture and code optimization. Articles that appear in TACO will either present new techniques and concepts or report on experiences and experiments with actual systems. Insights useful to architects, hardware or software developers, designers, builders, and users will be emphasized.