{"title":"PLC-cache:用于基于重复数据删除的主存储的耐用SSD缓存","authors":"Jian Liu, Yunpeng Chai, X. Qin, Y. Xiao","doi":"10.1109/MSST.2014.6855536","DOIUrl":null,"url":null,"abstract":"Data deduplication techniques improve cost efficiency by dramatically reducing space needs of storage systems. SSD-based data cache has been adopted to remedy the declining I/O performance induced by deduplication operations in the latency-sensitive primary storage. Unfortunately, frequent data updates caused by classical cache algorithms (e.g., FIFO, LRU, and LFU) inevitably slow down SSDs' I/O processing speed while significantly shortening SSDs' lifetime. To address this problem, we propose a new approach-PLC-Cache-to greatly improve the I/O performance as well as write durability of SSDs. PLC-Cache is conducive to amplifying the proportion of the Popular and Long-term Cached (PLC) data, which is infrequently written and kept in SSD cache in a long time period to catalyze cache hits, in an entire SSD written data set. PLC-Cache advocates a two-phase approach. First, non-popular data are ruled out from being written into SSDs. Second, PLC-Cache makes an effort to convert SSD written data into PLC-data as much as possible. Our experimental results based on a practical deduplication system indicate that compared with the existing caching schemes, PLC-Cache shortens data access latency by an average of 23.4%. Importantly, PLC-Cache improves the lifetime of SSD-based caches by reducing the amount of data written to SSDs by a factor of 15.7.","PeriodicalId":188071,"journal":{"name":"2014 30th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"32","resultStr":"{\"title\":\"PLC-cache: Endurable SSD cache for deduplication-based primary storage\",\"authors\":\"Jian Liu, Yunpeng Chai, X. Qin, Y. Xiao\",\"doi\":\"10.1109/MSST.2014.6855536\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Data deduplication techniques improve cost efficiency by dramatically reducing space needs of storage systems. SSD-based data cache has been adopted to remedy the declining I/O performance induced by deduplication operations in the latency-sensitive primary storage. Unfortunately, frequent data updates caused by classical cache algorithms (e.g., FIFO, LRU, and LFU) inevitably slow down SSDs' I/O processing speed while significantly shortening SSDs' lifetime. To address this problem, we propose a new approach-PLC-Cache-to greatly improve the I/O performance as well as write durability of SSDs. PLC-Cache is conducive to amplifying the proportion of the Popular and Long-term Cached (PLC) data, which is infrequently written and kept in SSD cache in a long time period to catalyze cache hits, in an entire SSD written data set. PLC-Cache advocates a two-phase approach. First, non-popular data are ruled out from being written into SSDs. Second, PLC-Cache makes an effort to convert SSD written data into PLC-data as much as possible. Our experimental results based on a practical deduplication system indicate that compared with the existing caching schemes, PLC-Cache shortens data access latency by an average of 23.4%. Importantly, PLC-Cache improves the lifetime of SSD-based caches by reducing the amount of data written to SSDs by a factor of 15.7.\",\"PeriodicalId\":188071,\"journal\":{\"name\":\"2014 30th Symposium on Mass Storage Systems and Technologies (MSST)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-06-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"32\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 30th Symposium on Mass Storage Systems and Technologies (MSST)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MSST.2014.6855536\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 30th Symposium on Mass Storage Systems and Technologies (MSST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MSST.2014.6855536","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 32
摘要
重复数据删除技术通过大幅降低存储系统对空间的需求,提高了成本效益。针对对延迟敏感的主存储中重复数据删除操作导致I/O性能下降的问题,采用了基于ssd的数据缓存。不幸的是,经典缓存算法(如FIFO、LRU和LFU)导致的频繁数据更新不可避免地降低了ssd的I/O处理速度,同时显著缩短了ssd的使用寿命。为了解决这个问题,我们提出了一种新的方法- plc - cache -大大提高了固态硬盘的I/O性能和写入持久性。PLC- cache有利于扩大PLC (Popular and long -term Cached)数据在整个SSD写入数据集中的比例,这些数据不经常写入,并且在很长一段时间内保存在SSD缓存中以催化缓存命中。PLC-Cache提倡两阶段方法。首先,不流行的数据被排除在写入ssd之外。其次,PLC-Cache尽可能地将SSD写入的数据转换为plc数据。基于一个实际的重复数据删除系统的实验结果表明,与现有的缓存方案相比,PLC-Cache平均缩短了23.4%的数据访问延迟。重要的是,PLC-Cache通过将写入ssd的数据量减少15.7倍,提高了基于ssd的缓存的寿命。
PLC-cache: Endurable SSD cache for deduplication-based primary storage
Data deduplication techniques improve cost efficiency by dramatically reducing space needs of storage systems. SSD-based data cache has been adopted to remedy the declining I/O performance induced by deduplication operations in the latency-sensitive primary storage. Unfortunately, frequent data updates caused by classical cache algorithms (e.g., FIFO, LRU, and LFU) inevitably slow down SSDs' I/O processing speed while significantly shortening SSDs' lifetime. To address this problem, we propose a new approach-PLC-Cache-to greatly improve the I/O performance as well as write durability of SSDs. PLC-Cache is conducive to amplifying the proportion of the Popular and Long-term Cached (PLC) data, which is infrequently written and kept in SSD cache in a long time period to catalyze cache hits, in an entire SSD written data set. PLC-Cache advocates a two-phase approach. First, non-popular data are ruled out from being written into SSDs. Second, PLC-Cache makes an effort to convert SSD written data into PLC-data as much as possible. Our experimental results based on a practical deduplication system indicate that compared with the existing caching schemes, PLC-Cache shortens data access latency by an average of 23.4%. Importantly, PLC-Cache improves the lifetime of SSD-based caches by reducing the amount of data written to SSDs by a factor of 15.7.