File systems unfit as distributed storage backends: lessons from 10 years of Ceph evolution

Abutalib Aghayev, S. Weil, Michael Kuchnik, M. Nelson, G. Ganger, George Amvrosiadis
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019-10-27
DOI: 10.1145/3341301.3359656 (https://doi.org/10.1145/3341301.3359656)
Citations: 66

Abstract

For a decade, the Ceph distributed file system followed the conventional wisdom of building its storage backend on top of local file systems. This is a preferred choice for most distributed file systems today because it allows them to benefit from the convenience and maturity of battle-tested code. Ceph's experience, however, shows that this comes at a high price. First, developing a zero-overhead transaction mechanism is challenging. Second, metadata performance at the local level can significantly affect performance at the distributed level. Third, supporting emerging storage hardware is painstakingly slow. Ceph addressed these issues with BlueStore, a new backend designed to run directly on raw storage devices. In only two years since its inception, BlueStore outperformed previous established backends and is adopted by 70% of users in production. By running in user space and fully controlling the I/O stack, it has enabled space-efficient metadata and data checksums, fast overwrites of erasure-coded data, inline compression, decreased performance variability, and avoided a series of performance pitfalls of local file systems. Finally, it makes the adoption of backwards-incompatible storage hardware possible, an important trait in a changing storage landscape that is learning to embrace hardware diversity.