ShardFS vs. IndexFS: replication vs. caching strategies for distributed metadata management in cloud storage systems

Lin Xiao, Kai Ren, Qing Zheng, Garth A. Gibson
Published in: Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015-08-27
DOI: 10.1145/2806777.2806844 (https://doi.org/10.1145/2806777.2806844)
Citations: 36

Abstract

The rapid growth of cloud storage systems calls for fast and scalable namespace processing. While few commercial file systems offer anything better than federating individually non-scalable namespace servers, a recent academic file system, IndexFS, demonstrates scalable namespace processing based on client caching of directory entries and permissions (directory lookup state) with no per-client state in servers. In this paper we explore explicit replication of directory lookup state in all servers as an alternative to caching this information in all clients. Both approaches eliminate most of the repeated RPCs to different servers that are otherwise needed to resolve hierarchical permission tests. Our realization of server-replicated directory lookup state, ShardFS, employs a novel file-system-specific hybrid of optimistic and pessimistic concurrency control that favors single-object transactions over distributed transactions. Our experiments suggest that if directory lookup state mutation is a fixed fraction of operations (strong scaling for metadata), server replication does not scale as well as client caching; but if directory lookup state mutation is proportional to the number of jobs rather than the number of processes per job (weak scaling for metadata), then server replication can scale more linearly than client caching and also provide lower 70th-percentile response times.
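
To make the trade-off concrete, the following is a minimal toy model, not the ShardFS or IndexFS implementation. It assumes that every ancestor directory of a path may live on a different server, that each remote permission check costs one RPC, and that a replicated directory update must reach every server. All function names are hypothetical, and cache expiry or invalidation is not modeled.

```python
def ancestors(path: str) -> list[str]:
    """Return the ancestor directories of an absolute path, root first."""
    parts = [p for p in path.split("/") if p]
    return ["/"] + ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]


def rpcs_no_sharing(path: str) -> int:
    """Assumed baseline: one permission-check RPC per ancestor, plus the operation."""
    return len(ancestors(path)) + 1


def rpcs_client_cache(path: str, cache: set[str]) -> int:
    """Client caching: only ancestors missing from the cache cost an RPC.

    Missed ancestors are added to the cache, so later lookups that share
    the same prefix skip those RPCs.
    """
    misses = [d for d in ancestors(path) if d not in cache]
    cache.update(misses)
    return len(misses) + 1


def rpcs_server_replication(path: str) -> int:
    """Server replication: the target server resolves all ancestors locally."""
    return 1


def rpcs_replicated_directory_mutation(num_servers: int) -> int:
    """Assumed cost of changing directory lookup state: one update per replica."""
    return num_servers


if __name__ == "__main__":
    cache: set[str] = set()
    print(rpcs_no_sharing("/a/b/c/file"))            # cold path walk
    print(rpcs_client_cache("/a/b/c/file", cache))   # first access fills the cache
    print(rpcs_client_cache("/a/b/c/other", cache))  # shared prefix is now cached
    print(rpcs_server_replication("/a/b/c/file"))
    print(rpcs_replicated_directory_mutation(8))
```

Under these assumptions, both strategies bring a repeated lookup down to a single RPC. The modeled cost of replication appears only when directory lookup state changes, which is why the fraction of such mutations in the workload matters for scaling.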