利用局部性实现分布式RAM存储中的快速故障恢复

Yiming Zhang, Dongsheng Li, Ling Liu
{"title":"利用局部性实现分布式RAM存储中的快速故障恢复","authors":"Yiming Zhang, Dongsheng Li, Ling Liu","doi":"10.1145/3289604","DOIUrl":null,"url":null,"abstract":"Distributed RAM storage aggregates the RAM of servers in data center networks (DCN) to provide extremely high I/O performance for large-scale cloud systems. For quick recovery of storage server failures, MemCube [53] exploits the proximity of the BCube network to limit the recovery traffic to the recovery servers’ 1-hop neighborhood. However, the previous design is applicable only to the symmetric BCube(n,k) network with nk+1 nodes and has suboptimal recovery performance due to congestion and contention. To address these problems, in this article, we propose CubeX, which (i) generalizes the “1-hop” principle of MemCube for arbitrary cube-based networks and (ii) improves the throughput and recovery performance of RAM-based key-value (KV) store via cross-layer optimizations. At the core of CubeX is to leverage the glocality (= globality + locality) of cube-based networks: It scatters backup data across a large number of disks globally distributed throughout the cube and restricts all recovery traffic within the small local range of each server node. Our evaluation shows that CubeX not only efficiently supports RAM-based KV store for cube-based networks but also significantly outperforms MemCube and RAMCloud in both throughput and recovery time.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Leveraging Glocality for Fast Failure Recovery in Distributed RAM Storage\",\"authors\":\"Yiming Zhang, Dongsheng Li, Ling Liu\",\"doi\":\"10.1145/3289604\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Distributed RAM storage aggregates the RAM of servers in data center networks (DCN) to provide extremely high I/O performance for large-scale cloud systems. For quick recovery of storage server failures, MemCube [53] exploits the proximity of the BCube network to limit the recovery traffic to the recovery servers’ 1-hop neighborhood. However, the previous design is applicable only to the symmetric BCube(n,k) network with nk+1 nodes and has suboptimal recovery performance due to congestion and contention. To address these problems, in this article, we propose CubeX, which (i) generalizes the “1-hop” principle of MemCube for arbitrary cube-based networks and (ii) improves the throughput and recovery performance of RAM-based key-value (KV) store via cross-layer optimizations. At the core of CubeX is to leverage the glocality (= globality + locality) of cube-based networks: It scatters backup data across a large number of disks globally distributed throughout the cube and restricts all recovery traffic within the small local range of each server node. Our evaluation shows that CubeX not only efficiently supports RAM-based KV store for cube-based networks but also significantly outperforms MemCube and RAMCloud in both throughput and recovery time.\",\"PeriodicalId\":273014,\"journal\":{\"name\":\"ACM Transactions on Storage (TOS)\",\"volume\":\"46 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-02-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Storage (TOS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3289604\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Storage (TOS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3289604","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

摘要

分布式RAM存储将数据中心网络(DCN)中服务器的RAM聚合在一起,为大规模云系统提供极高的I/O性能。为了快速恢复存储服务器故障,MemCube[53]利用BCube网络的邻近性,将恢复流量限制在恢复服务器的1跳邻居中。但是,上述设计仅适用于具有nk+1个节点的对称BCube(n,k)网络,并且由于拥塞和争用,恢复性能不是最优。为了解决这些问题,在本文中,我们提出了CubeX,它(i)将MemCube的“1-hop”原则推广到任意基于立方体的网络中,(ii)通过跨层优化提高基于ram的键值(KV)存储的吞吐量和恢复性能。CubeX的核心是利用基于多维数据集的网络的全局局部性(=全局性+局部性):它将备份数据分散到全局分布在整个多维数据集的大量磁盘上,并将所有恢复流量限制在每个服务器节点的小本地范围内。我们的评估表明,CubeX不仅有效地支持基于ram的基于立方体网络的KV存储,而且在吞吐量和恢复时间方面都明显优于MemCube和RAMCloud。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Leveraging Glocality for Fast Failure Recovery in Distributed RAM Storage
Distributed RAM storage aggregates the RAM of servers in data center networks (DCN) to provide extremely high I/O performance for large-scale cloud systems. For quick recovery of storage server failures, MemCube [53] exploits the proximity of the BCube network to limit the recovery traffic to the recovery servers’ 1-hop neighborhood. However, the previous design is applicable only to the symmetric BCube(n,k) network with nk+1 nodes and has suboptimal recovery performance due to congestion and contention. To address these problems, in this article, we propose CubeX, which (i) generalizes the “1-hop” principle of MemCube for arbitrary cube-based networks and (ii) improves the throughput and recovery performance of RAM-based key-value (KV) store via cross-layer optimizations. At the core of CubeX is to leverage the glocality (= globality + locality) of cube-based networks: It scatters backup data across a large number of disks globally distributed throughout the cube and restricts all recovery traffic within the small local range of each server node. Our evaluation shows that CubeX not only efficiently supports RAM-based KV store for cube-based networks but also significantly outperforms MemCube and RAMCloud in both throughput and recovery time.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
WebAssembly-based Delta Sync for Cloud Storage Services DEFUSE: An Interface for Fast and Correct User Space File System Access Donag: Generating Efficient Patches and Diffs for Compressed Archives Building GC-free Key-value Store on HM-SMR Drives with ZoneFS Kangaroo: Theory and Practice of Caching Billions of Tiny Objects on Flash
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1