Cache What You Need to Cache

ACM Transactions on Storage (TOS), Pub Date: 2020-07-16, DOI: 10.1145/3397766

Hua Wang, Jiawei Zhang, Ping-Hsiu Huang, Xinbo Yi, Bin Cheng, Ke Zhou

Citations: 3

Abstract

The SSD plays an important role in caching systems due to its high performance-to-cost ratio. Because the cache is typically smaller than the backend storage by an order of magnitude or more, the write density (defined as writes per unit time and space) of the SSD cache is much higher than that of HDD storage, which poses a serious challenge to the SSD’s lifetime. Meanwhile, under social network workloads, many writes to the SSD cache are unnecessary. For example, our study of Tencent’s photo caching shows that about 61% of all photos are accessed only once, yet they are still swapped in and out of the cache. Therefore, if we can predict such photos proactively and prevent them from entering the cache, we can eliminate unnecessary SSD cache writes and improve cache space utilization. To address this challenge, we put forward a “one-time-access criteria” that is applied to the cache space and further propose a “one-time-access-exclusion” policy. Based on these two techniques, we design a prediction-based classifier to facilitate the policy. Unlike state-of-the-art history-based predictions, our prediction is non-history oriented, which makes good prediction accuracy challenging to achieve. To address this issue, we integrate a decision tree into the classifier, extract social-related information as classifying features, and apply cost-sensitive learning to improve classification precision. With these techniques, we attain a prediction accuracy greater than 80%. Experimental results show that the one-time-access-exclusion approach results in outstanding cache performance in most aspects. Take LRU, for instance: applying our approach improves the hit rate by 4.4%, decreases cache writes by 56.8%, and cuts the average access latency by 5.5%.
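
As a rough illustration of the one-time-access-exclusion idea described in the abstract, the sketch below places an admission check in front of a plain LRU cache and counts cache writes. Everything here is an assumption made for illustration: the class `AdmissionLRU`, the helper `replay`, the synthetic trace, and especially the `looks_reused` predicate, which is a hand-written oracle standing in for the paper's decision-tree classifier (whose features and training are not reproduced here).

```python
from collections import OrderedDict
from typing import Callable, Hashable, Iterable


class AdmissionLRU:
    """LRU cache that asks a predicate before admitting a missed key."""

    def __init__(self, capacity: int, admit: Callable[[Hashable], bool]):
        self.capacity = capacity
        self.admit = admit            # hypothetical stand-in for a classifier
        self.items = OrderedDict()    # least recently used key comes first
        self.hits = 0
        self.misses = 0
        self.writes = 0               # number of insertions into the cache

    def access(self, key: Hashable) -> bool:
        """Process one request; return True on a cache hit."""
        if key in self.items:
            self.items.move_to_end(key)   # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if self.admit(key):               # predicted one-time keys are skipped
            if len(self.items) >= self.capacity:
                self.items.popitem(last=False)   # evict least recently used
            self.items[key] = True
            self.writes += 1
        return False


def replay(trace: Iterable[Hashable], cache: AdmissionLRU) -> AdmissionLRU:
    """Feed every request in the trace to the cache and return the cache."""
    for key in trace:
        cache.access(key)
    return cache


# Synthetic trace (an assumption): "h*" keys are reused, "o*" keys appear once.
trace = ["h0", "h1", "o0", "o1", "h0", "h1", "o2", "o3", "h0", "h1"]


def looks_reused(key: Hashable) -> bool:
    """Oracle predicate for this toy trace only; not the paper's classifier."""
    return str(key).startswith("h")


plain = replay(trace, AdmissionLRU(capacity=2, admit=lambda key: True))
filtered = replay(trace, AdmissionLRU(capacity=2, admit=looks_reused))

print("plain LRU:    hits", plain.hits, "writes", plain.writes)
print("filtered LRU: hits", filtered.hits, "writes", filtered.writes)
```

On this toy trace the one-time keys repeatedly push the reused keys out of the plain LRU cache, so excluding them both raises the hit count and lowers the write count. The size of the effect depends entirely on the trace and on how accurate the predicate is, so these numbers say nothing about the figures reported in the abstract.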
