Sibyl: adaptive and extensible data placement in hybrid storage systems using online reinforcement learning

Gagandeep Singh, Rakesh Nadig, Jisung Park, Rahul Bera, Nastaran Hajinazar, D. Novo, Juan Gómez-Luna, S. Stuijk, H. Corporaal, O. Mutlu
Proceedings of the 49th Annual International Symposium on Computer Architecture, 2022-05-15
DOI: 10.1145/3470496.3527442 (https://doi.org/10.1145/3470496.3527442)
Citations: 14

Abstract

Hybrid storage systems (HSS) use multiple different storage devices to provide high and scalable storage capacity at high performance. Data placement across different devices is critical to maximize the benefits of such a hybrid system. Recent research proposes various techniques that aim to accurately identify performance-critical data to place it in a "best-fit" storage device. Unfortunately, most of these techniques are rigid, which (1) limits their ability to adapt and perform well across a wide range of workloads and storage device configurations, and (2) makes it difficult for designers to extend these techniques to storage system configurations other than the one they were designed for (e.g., with a different number or different types of storage devices).

Our goal is to design a new data placement technique for hybrid storage systems that overcomes these issues and provides: (1) adaptivity, by continuously learning from and adapting to the workload and the storage device characteristics, and (2) easy extensibility to a wide range of workloads and HSS configurations. We introduce Sibyl, the first technique that uses reinforcement learning for data placement in hybrid storage systems. Sibyl observes different features of the running workload as well as the storage devices to make system-aware data placement decisions. For every decision it makes, Sibyl receives a reward from the system that it uses to evaluate the long-term performance impact of its decision and continuously optimizes its data placement policy online.

We implement Sibyl on real systems with various HSS configurations, including dual- and tri-hybrid storage systems, and extensively compare it against four previously proposed data placement techniques (both heuristic- and machine learning-based) over a wide range of workloads. Our results show that Sibyl provides 21.6%/19.9% performance improvement in a performance-oriented/cost-oriented HSS configuration compared to the best previous data placement technique. Our evaluation using an HSS configuration with three different storage devices shows that Sibyl outperforms the state-of-the-art data placement policy by 23.9%-48.2%, while significantly reducing the system architect's burden in designing a data placement mechanism that can simultaneously incorporate three storage devices. We show that Sibyl achieves 80% of the performance of an oracle policy that has complete knowledge of future access patterns while incurring a very modest storage overhead of only 124.4 KiB.
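
The observe, decide, and reward loop described in the abstract can be illustrated with a small sketch. This is not Sibyl's actual design: it swaps in plain tabular Q-learning, a two-device system, a single workload feature (whether a page is "hot" or "cold"), and a made-up cost model. Every name and constant below (`PlacementAgent`, `LATENCY`, `FAST_SPACE_COST`, `ACCESSES`) is an assumption chosen for illustration, not a value from the paper.

```python
import random
from collections import defaultdict

ACTIONS = ("fast", "slow")              # hypothetical dual-device HSS
LATENCY = {"fast": 1.0, "slow": 10.0}   # made-up per-access latencies
FAST_SPACE_COST = 50.0                  # made-up charge for using scarce fast capacity
ACCESSES = {"hot": 10, "cold": 1}       # made-up future access counts per page class


class PlacementAgent:
    """Tabular Q-learning agent (illustrative stand-in, not Sibyl itself)."""

    def __init__(self, alpha=0.05, gamma=0.5, epsilon=0.1, seed=0):
        self.q = defaultdict(float)     # (state, action) -> estimated long-term value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def greedy(self, state):
        # Placement the current policy prefers for this state.
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def act(self, state):
        # Epsilon-greedy: explore occasionally, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return self.greedy(state)

    def learn(self, state, action, reward, next_state):
        # One-step Q-learning update toward reward plus discounted future value.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def reward(state, action):
    # Negative total latency, with an extra charge when the fast device is used.
    cost = LATENCY[action] * ACCESSES[state]
    if action == "fast":
        cost += FAST_SPACE_COST
    return -cost


def train(agent, steps=5000, seed=1):
    # Online loop: observe a request, place it, receive a reward, update the policy.
    rng = random.Random(seed)
    state = rng.choice(("hot", "cold"))
    for _ in range(steps):
        action = agent.act(state)
        next_state = rng.choice(("hot", "cold"))
        agent.learn(state, action, reward(state, action), next_state)
        state = next_state
    return agent


agent = train(PlacementAgent())
print(agent.greedy("hot"), agent.greedy("cold"))
```

Under this toy cost model the agent should come to prefer the fast device for hot pages and the slow device for cold pages, without that rule being hard-coded. A state that also carried device features (for example, remaining fast-device capacity), or a third entry in `ACTIONS`, would fit the same loop, which is the kind of extensibility the abstract argues for.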