A Machine Learning Based Write Policy for SSD Cache in Cloud Block Storage

2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) · Pub Date: 2020-03-01 · DOI: 10.23919/DATE48585.2020.9116539

Yu Zhang, Ke Zhou, Ping Huang, Hua Wang, Jianying Hu, Yangtao Wang, Yongguang Ji, Bin Cheng

Citations: 8

Abstract

SSD caches play an important role in cloud storage systems. The associated write policy, which enforces admission control over the data filled into the cache, has a significant impact on the performance of the cache system and on the amount of write traffic to the SSD cache. Based on our analysis of a typical cloud block storage system, approximately 47.09% of writes are write-only, i.e., writes to blocks that have not been read during a certain time window. Naively writing write-only data to the SSD cache introduces a large number of harmful writes without any contribution to cache performance. On the other hand, identifying and filtering out write-only data in real time is a challenging task, especially in a cloud environment running changing and diverse workloads. In this paper, to alleviate this problem, we propose ML-WP, a Machine Learning Based Write Policy, which reduces write traffic to SSDs by not caching write-only data. The main challenge in this approach is to identify write-only data in real time. To realize ML-WP and identify write-only data accurately, we use machine learning methods to classify data into two groups (i.e., write-only and normal data). Based on this classification, write-only data is written directly to backend storage without being cached. Experimental results show that, compared with the write-back policy widely deployed in industry, ML-WP decreases write traffic to the SSD cache by 41.52%, while improving the hit ratio by 2.61% and reducing the average read latency by 37.52%.
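The two ideas in the abstract, labeling a write as write-only and bypassing the cache for such writes, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The trace format, the direction of the time window (reads counted after the write), the `predict_write_only` callable standing in for a trained classifier, and the invalidation of a stale cached copy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical trace record: (timestamp, op, block_id), op is "R" or "W".
Request = Tuple[float, str, int]


def label_write_only(trace: List[Request], window: float) -> List[Tuple[int, bool]]:
    """Return (trace_index, is_write_only) for every write in a time-sorted trace.

    Assumption: a write is write-only when its block is not read within
    `window` time units after the write.
    """
    labels = []
    for i, (t, op, blk) in enumerate(trace):
        if op != "W":
            continue
        read_soon = any(
            o == "R" and b == blk and t < ts <= t + window
            for ts, o, b in trace[i + 1:]
        )
        labels.append((i, not read_soon))
    return labels


@dataclass
class WritePolicySketch:
    """Admission control on writes, driven by a write-only predictor."""

    # Placeholder for a trained classifier (hypothetical interface).
    predict_write_only: Callable[[int], bool]
    cache: Dict[int, bytes] = field(default_factory=dict)    # stands in for the SSD cache
    backend: Dict[int, bytes] = field(default_factory=dict)  # stands in for backend storage
    ssd_writes: int = 0

    def write(self, block_id: int, data: bytes) -> None:
        if self.predict_write_only(block_id):
            # Predicted write-only: write directly to the backend, skip the cache.
            self.backend[block_id] = data
            # Assumption: drop any stale cached copy so later reads stay correct.
            self.cache.pop(block_id, None)
        else:
            # Normal data: admit into the cache (flush to backend not modeled).
            self.cache[block_id] = data
            self.ssd_writes += 1


trace = [(0.0, "W", 1), (1.0, "R", 1), (2.0, "W", 2), (10.0, "R", 2)]
labels = label_write_only(trace, window=5.0)

policy = WritePolicySketch(predict_write_only=lambda blk: blk == 2)
policy.write(1, b"a")
policy.write(2, b"b")
```

In this toy trace, the write to block 1 is read one time unit later and is labeled normal, while the write to block 2 is not read within the window and is labeled write-only. Labels of this kind could serve as training targets for a classifier; the sketch leaves the choice of model and features open, since the abstract does not specify them.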


