{"title":"LaDy: Enabling <u>L</u> ocality- <u>a</u> ware <u>D</u> eduplication Technolog <u>y</u> on Shingled Magnetic Recording Drives","authors":"Jung-Hsiu Chang, Tzu-Yu Chang, Yi-Chao Shih, Tseng-Yi Chen","doi":"10.1145/3607921","DOIUrl":null,"url":null,"abstract":"The continuous increase in data volume has led to the adoption of shingled-magnetic recording (SMR) as the primary technology for modern storage drives. This technology offers high storage density and low unit cost but introduces significant performance overheads due to the read-update-write operation and garbage collection (GC) process. To reduce these overheads, data deduplication has been identified as an effective solution as it reduces the amount of written data to an SMR-based storage device. However, deduplication can result in poor data locality, leading to decreased read performance. To tackle this problem, this study proposes a data locality-aware deduplication technology, LaDy, that considers both the overheads of writing duplicate data and the impact on data locality to determine whether the duplicate data should be written. LaDy integrates with DiskSim, an open-source project, and modifies it to simulate an SMR-based drive. The experimental results demonstrate that LaDy can significantly reduce the response time in the best-case scenario by 87.3% compared with CAFTL on the SMR drive. LaDy achieves this by selectively writing duplicate data, which preserves data locality, resulting in improved read performance. The proposed solution provides an effective and efficient method for mitigating the performance overheads associated with data deduplication in SMR-based storage devices.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"1 1","pages":"0"},"PeriodicalIF":2.8000,"publicationDate":"2023-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Embedded Computing Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3607921","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0
Abstract
The continuous increase in data volume has led to the adoption of shingled-magnetic recording (SMR) as the primary technology for modern storage drives. This technology offers high storage density and low unit cost but introduces significant performance overheads due to the read-update-write operation and garbage collection (GC) process. To reduce these overheads, data deduplication has been identified as an effective solution as it reduces the amount of written data to an SMR-based storage device. However, deduplication can result in poor data locality, leading to decreased read performance. To tackle this problem, this study proposes a data locality-aware deduplication technology, LaDy, that considers both the overheads of writing duplicate data and the impact on data locality to determine whether the duplicate data should be written. LaDy integrates with DiskSim, an open-source project, and modifies it to simulate an SMR-based drive. The experimental results demonstrate that LaDy can significantly reduce the response time in the best-case scenario by 87.3% compared with CAFTL on the SMR drive. LaDy achieves this by selectively writing duplicate data, which preserves data locality, resulting in improved read performance. The proposed solution provides an effective and efficient method for mitigating the performance overheads associated with data deduplication in SMR-based storage devices.
期刊介绍:
The design of embedded computing systems, both the software and hardware, increasingly relies on sophisticated algorithms, analytical models, and methodologies. ACM Transactions on Embedded Computing Systems (TECS) aims to present the leading work relating to the analysis, design, behavior, and experience with embedded computing systems.