Offline and Online Algorithms for SSD Management

Tomer Lange, J. Naor, G. Yadgar
{"title":"Offline and Online Algorithms for SSD Management","authors":"Tomer Lange, J. Naor, G. Yadgar","doi":"10.1145/3491045","DOIUrl":null,"url":null,"abstract":"Flash-based solid state drives (SSDs) have gained a central role in the infrastructure of large-scale datacenters, as well as in commodity servers and personal devices. The main limitation of flash media is its inability to support update-in-place: after data has been written to a physical location, it has to be erased before new data can be written to it. Moreover, SSDs support read and write operations in granularity of pages, while erasures are performed on entire blocks, which often contain hundreds of pages. When erasing a block, any valid data it stores must be rewritten to a clean location. As an SSD eventually wears out with progressing number of erasures, the efficiency of the management algorithm has a significant impact on its endurance. In this paper we first formally define the SSD management problem. We then explore this problem from an algorithmic perspective, considering it in both offline and online settings. In the offline setting, we present a near-optimal algorithm that, given any input, performs a negligible number of rewrites (relative to the input length). We also discuss the hardness of the offline problem. In the online setting, we first consider algorithms that have no prior knowledge about the input. We prove that no deterministic algorithm outperforms the greedy algorithm in this setting, and discuss the possible benefit of randomization. We then augment our model, assuming that each request for a page arrives with a prediction of the next time the page is updated. We design an online algorithm that uses such predictions, and show that its performance improves as the prediction error decreases. We also show that the performance of our algorithm is never worse than that guaranteed by the greedy algorithm, even when the prediction error is large. We complement our theoretical findings with an empirical evaluation of our algorithms, comparing them with the state-of-the-art scheme. The results confirm that our algorithms exhibit an improved performance for a wide range of input traces.","PeriodicalId":426760,"journal":{"name":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM on Measurement and Analysis of Computing Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3491045","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Flash-based solid state drives (SSDs) have gained a central role in the infrastructure of large-scale datacenters, as well as in commodity servers and personal devices. The main limitation of flash media is its inability to support update-in-place: after data has been written to a physical location, it has to be erased before new data can be written to it. Moreover, SSDs support read and write operations in granularity of pages, while erasures are performed on entire blocks, which often contain hundreds of pages. When erasing a block, any valid data it stores must be rewritten to a clean location. As an SSD eventually wears out with progressing number of erasures, the efficiency of the management algorithm has a significant impact on its endurance. In this paper we first formally define the SSD management problem. We then explore this problem from an algorithmic perspective, considering it in both offline and online settings. In the offline setting, we present a near-optimal algorithm that, given any input, performs a negligible number of rewrites (relative to the input length). We also discuss the hardness of the offline problem. In the online setting, we first consider algorithms that have no prior knowledge about the input. We prove that no deterministic algorithm outperforms the greedy algorithm in this setting, and discuss the possible benefit of randomization. We then augment our model, assuming that each request for a page arrives with a prediction of the next time the page is updated. We design an online algorithm that uses such predictions, and show that its performance improves as the prediction error decreases. We also show that the performance of our algorithm is never worse than that guaranteed by the greedy algorithm, even when the prediction error is large. We complement our theoretical findings with an empirical evaluation of our algorithms, comparing them with the state-of-the-art scheme. The results confirm that our algorithms exhibit an improved performance for a wide range of input traces.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
SSD管理的离线和在线算法
基于闪存的固态硬盘(ssd)已经在大型数据中心的基础设施以及商用服务器和个人设备中发挥了核心作用。flash媒体的主要限制是它不能支持就地更新:在数据被写入物理位置之后,必须在新数据被写入之前将其擦除。此外,ssd支持以页面粒度进行读写操作,而擦除操作是在整个块上执行的,这些块通常包含数百个页面。当擦除一个块时,它存储的任何有效数据都必须重写到一个干净的位置。随着擦除次数的增加,SSD最终会磨损,因此管理算法的效率对SSD的耐用性有很大的影响。本文首先正式定义了固态硬盘的管理问题。然后,我们从算法的角度探讨这个问题,在离线和在线设置中考虑它。在离线设置中,我们提出了一个近乎最优的算法,给定任何输入,执行重写的次数可以忽略不计(相对于输入长度)。我们还讨论了离线问题的硬度。在在线设置中,我们首先考虑对输入没有先验知识的算法。我们证明了在这种情况下没有确定性算法优于贪婪算法,并讨论了随机化可能带来的好处。然后我们扩展我们的模型,假设对页面的每个请求到达时都预测了页面的下一次更新时间。我们设计了一个使用这种预测的在线算法,并表明其性能随着预测误差的减小而提高。我们还表明,即使在预测误差很大的情况下,我们的算法的性能也不会比贪婪算法所保证的性能差。我们通过对算法的实证评估来补充我们的理论发现,并将它们与最先进的方案进行比较。结果证实,我们的算法在大范围的输入迹线中表现出改进的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
CiteScore
3.20
自引率
0.00%
发文量
0
期刊最新文献
A Large Scale Study and Classification of VirusTotal Reports on Phishing and Malware URLs POMACS V7, N2, June 2023 Editorial SplitRPC: A {Control + Data} Path Splitting RPC Stack for ML Inference Serving Smash: Flexible, Fast, and Resource-efficient Placement and Lookup of Distributed Storage Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1