基于机器学习的云块存储SSD缓存写策略研究

2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) Pub Date : 2020-03-01 DOI:10.23919/DATE48585.2020.9116539

Yu Zhang, Ke Zhou, Ping Huang, Hua Wang, Jianying Hu, Yangtao Wang, Yongguang Ji, Bin Cheng

{"title":"基于机器学习的云块存储SSD缓存写策略研究","authors":"Yu Zhang, Ke Zhou, Ping Huang, Hua Wang, Jianying Hu, Yangtao Wang, Yongguang Ji, Bin Cheng","doi":"10.23919/DATE48585.2020.9116539","DOIUrl":null,"url":null,"abstract":"Nowadays, SSD cache plays an important role in cloud storage systems. The associated write policy, which enforces an admission control policy regarding filling data into the cache, has a significant impact on the performance of the cache system and the amount of write traffic to SSD caches. Based on our analysis on a typical cloud block storage system, approximately 47.09% writes are write-only, i.e., writes to the blocks which have not been read during a certain time window. Naively writing the write-only data to the SSD cache unnecessarily introduces a large number of harmful writes to the SSD cache without any contribution to cache performance. On the other hand, it is a challenging task to identify and filter out those write-only data in a real-time manner, especially in a cloud environment running changing and diverse workloads.In this paper, to alleviate the above cache problem, we propose an ML-WP, Machine Learning Based Write Policy, which reduces write traffic to SSDs by avoiding writing write-only data. The main challenge in this approach is to identify write-only data in a real-time manner. To realize ML-WP and achieve accurate write-only data identification, we use machine learning methods to classify data into two groups (i.e., write-only and normal data). Based on this classification, the write-only data is directly written to backend storage without being cached. Experimental results show that, compared with the industry widely deployed write-back policy, ML-WP decreases write traffic to SSD cache by 41.52%, while improving the hit ratio by 2.61% and reducing the average read latency by 37.52%.","PeriodicalId":289525,"journal":{"name":"2020 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"A Machine Learning Based Write Policy for SSD Cache in Cloud Block Storage\",\"authors\":\"Yu Zhang, Ke Zhou, Ping Huang, Hua Wang, Jianying Hu, Yangtao Wang, Yongguang Ji, Bin Cheng\",\"doi\":\"10.23919/DATE48585.2020.9116539\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Nowadays, SSD cache plays an important role in cloud storage systems. The associated write policy, which enforces an admission control policy regarding filling data into the cache, has a significant impact on the performance of the cache system and the amount of write traffic to SSD caches. Based on our analysis on a typical cloud block storage system, approximately 47.09% writes are write-only, i.e., writes to the blocks which have not been read during a certain time window. Naively writing the write-only data to the SSD cache unnecessarily introduces a large number of harmful writes to the SSD cache without any contribution to cache performance. On the other hand, it is a challenging task to identify and filter out those write-only data in a real-time manner, especially in a cloud environment running changing and diverse workloads.In this paper, to alleviate the above cache problem, we propose an ML-WP, Machine Learning Based Write Policy, which reduces write traffic to SSDs by avoiding writing write-only data. The main challenge in this approach is to identify write-only data in a real-time manner. To realize ML-WP and achieve accurate write-only data identification, we use machine learning methods to classify data into two groups (i.e., write-only and normal data). Based on this classification, the write-only data is directly written to backend storage without being cached. Experimental results show that, compared with the industry widely deployed write-back policy, ML-WP decreases write traffic to SSD cache by 41.52%, while improving the hit ratio by 2.61% and reducing the average read latency by 37.52%.\",\"PeriodicalId\":289525,\"journal\":{\"name\":\"2020 Design, Automation & Test in Europe Conference & Exhibition (DATE)\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 Design, Automation & Test in Europe Conference & Exhibition (DATE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.23919/DATE48585.2020.9116539\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 Design, Automation & Test in Europe Conference & Exhibition (DATE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/DATE48585.2020.9116539","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 8

摘要

当前，SSD缓存在云存储系统中扮演着重要的角色。关联写策略对cache系统的性能和对SSD cache的写流量有较大的影响。根据我们对一个典型的云块存储系统的分析，大约47.09%的写操作是只写，即写到某个时间窗口内没有被读的块。如果单纯地将只写数据写入SSD cache，会导致不必要的大量有害的写操作，对SSD cache的性能没有任何影响。另一方面，以实时方式识别和过滤这些只写数据是一项具有挑战性的任务，特别是在运行不断变化和多样化工作负载的云环境中。在本文中，为了缓解上述缓存问题，我们提出了一个ML-WP，基于机器学习的写策略，它通过避免写只写数据来减少对ssd的写流量。这种方法的主要挑战是以实时的方式识别只写数据。为了实现ML-WP并实现准确的只写数据识别，我们使用机器学习方法将数据分为两组(即只写数据和正常数据)。根据这种分类，只写数据直接写入后端存储，而不缓存。实验结果表明，与业界广泛部署的回写策略相比，ML-WP将对SSD缓存的写流量减少了41.52%，命中率提高了2.61%，平均读延迟降低了37.52%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

A Machine Learning Based Write Policy for SSD Cache in Cloud Block Storage

Nowadays, SSD cache plays an important role in cloud storage systems. The associated write policy, which enforces an admission control policy regarding filling data into the cache, has a significant impact on the performance of the cache system and the amount of write traffic to SSD caches. Based on our analysis on a typical cloud block storage system, approximately 47.09% writes are write-only, i.e., writes to the blocks which have not been read during a certain time window. Naively writing the write-only data to the SSD cache unnecessarily introduces a large number of harmful writes to the SSD cache without any contribution to cache performance. On the other hand, it is a challenging task to identify and filter out those write-only data in a real-time manner, especially in a cloud environment running changing and diverse workloads.In this paper, to alleviate the above cache problem, we propose an ML-WP, Machine Learning Based Write Policy, which reduces write traffic to SSDs by avoiding writing write-only data. The main challenge in this approach is to identify write-only data in a real-time manner. To realize ML-WP and achieve accurate write-only data identification, we use machine learning methods to classify data into two groups (i.e., write-only and normal data). Based on this classification, the write-only data is directly written to backend storage without being cached. Experimental results show that, compared with the industry widely deployed write-back policy, ML-WP decreases write traffic to SSD cache by 41.52%, while improving the hit ratio by 2.61% and reducing the average read latency by 37.52%.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2020 Design, Automation & Test in Europe Conference & Exhibition (DATE)

自引率

0.00%

发文量