Tree-structured data regeneration with network coding in distributed storage systems

Jun Li, Shuang Yang, Xin Wang, X. Xue, Baochun Li
Published in: 2009 17th International Workshop on Quality of Service
Publication date: 2009-07-13
DOI: 10.1109/IWQoS.2009.5201391 (https://doi.org/10.1109/IWQoS.2009.5201391)
Citations: 36

Abstract

Distributed storage systems built on peer-to-peer networks can provide large-scale storage and high data reliability through redundancy schemes such as replication, erasure codes, and linear network coding. Redundant data may be lost because distributed systems are unstable: nodes depart permanently, hardware fails, and data are accidentally deleted. To maintain data availability, new redundant data must be regenerated on another node, referred to as a newcomer. Regeneration should finish as quickly as possible, because the regeneration time affects the reliability and availability of the stored data. It has been acknowledged that linear network coding can regenerate redundant data with less network traffic than replication or erasure codes. However, previous regeneration schemes are all star-structured: data are transferred directly from the existing storage nodes, referred to as providers, to the newcomer, so under heterogeneous bandwidth the regeneration time is always limited by the newcomer-provider path with the narrowest bandwidth. In this paper, we exploit the bandwidth between providers and propose a tree-structured regeneration scheme using linear network coding. In our scheme, data are transferred from the providers to the newcomer through a regeneration tree, defined as a spanning tree covering the newcomer and all the providers. In a regeneration tree, a provider can receive data from other providers, encode the received data together with the data it stores, and send the encoded data to another provider or to the newcomer. We prove that a maximum spanning tree is an optimal regeneration tree and analyze its performance. In a trace-based simulation, the tree-structured scheme reduces the regeneration time by 75%–82% and improves data availability by 73%–124%.
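
The contrast between a star-structured and a tree-structured regeneration can be illustrated with a small sketch. It is not the paper's implementation. It assumes symmetric link bandwidths and uses the narrowest link in the chosen structure as a stand-in for regeneration time. The node names, the bandwidth values, and the helper functions `maximum_spanning_tree` and `bottleneck` are all hypothetical. The tree is built with Kruskal's algorithm run on edges in descending bandwidth order, which yields a maximum spanning tree; such a tree is known to maximize the minimum edge weight among all spanning trees.

```python
def maximum_spanning_tree(nodes, bandwidth):
    """Kruskal's algorithm, taking edges in descending bandwidth order.

    `bandwidth` maps an undirected edge (u, v) to its bandwidth.
    Returns the chosen edges as (u, v, bandwidth) tuples.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    widest_first = sorted(bandwidth.items(), key=lambda kv: kv[1], reverse=True)
    for (u, v), bw in widest_first:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so keep it
            parent[root_u] = root_v
            tree.append((u, v, bw))
    return tree


def bottleneck(edges):
    """Narrowest link in a structure (assumed proxy for regeneration time)."""
    return min(bw for _, _, bw in edges)


# Hypothetical network: one newcomer and three providers.
nodes = ["newcomer", "p1", "p2", "p3"]
bandwidth = {
    ("newcomer", "p1"): 10,
    ("newcomer", "p2"): 2,
    ("newcomer", "p3"): 3,
    ("p1", "p2"): 8,
    ("p1", "p3"): 7,
    ("p2", "p3"): 4,
}

# Star: every provider sends directly to the newcomer.
star = [(u, v, bw) for (u, v), bw in bandwidth.items() if u == "newcomer"]
# Tree: providers may relay through each other.
tree = maximum_spanning_tree(nodes, bandwidth)

print("star bottleneck:", bottleneck(star))  # 2
print("tree bottleneck:", bottleneck(tree))  # 7
```

In this made-up network the star is held back by the 2-unit link between the newcomer and p2, while the tree routes p2 and p3 through p1 and its narrowest link is 7 units. The sketch covers only the choice of topology; the encoding step performed at each provider is not modeled.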