Erasure Coding in Object Stores: Challenges and Opportunities

Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing Pub Date : 2018-07-23 DOI:10.1145/3212734.3212799

Lewis Tseng


Citations: 1

Abstract

Recent years have seen tremendous growth in the popularity of online services accessed over the Internet. Our daily lives are becoming more and more dependent on these services, which generate and/or rely on huge amounts of data. One core technique for dealing with this unprecedented amount of data is the distributed storage system, which allows users/applications to read and write data in a distributed fashion and ensures fault tolerance, durability, scalability, and availability. This tutorial will focus on distributed key-value storage systems, i.e., read/write objects. One common implementation of such a read/write object replicates data across multiple servers or even data centers. The replication-based implementation has been studied in the literature, e.g., ABD [Attiya, Bar-Noy and Dolev '96] and LDR [Fan and Lynch '03], and adopted in practice, e.g., Cassandra, MongoDB, and DynamoDB. One drawback of the replication-based mechanism is its high storage and communication cost due to unnecessary redundancy. To address this issue, there is an ongoing effort in both academia and industry to apply erasure codes to distributed storage systems. For example, Microsoft applies erasure coding across data centers to build strongly consistent objects (Giza in Microsoft Azure Storage), and OpenStack provides erasure coding as a storage policy in its object store, Swift. However, the field is still fairly young and has many interesting open problems. This tutorial will focus on the challenges of using erasure codes in read/write objects that guarantee consistency. To begin with, I will introduce the concepts of consistency models and erasure codes, followed by some recent algorithms and existing practical systems. I will then discuss the state-of-the-art techniques in this field, and conclude the talk with potential challenges that lead to interesting research problems. The talk will be accessible to anyone with a basic knowledge of algorithms or programming. The first part of the results is due to Viveck Cadambe, Kishori Konwar, N. Prakash, Nancy Lynch, and Muriel Médard. At the end, I will also share our recent results.
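The storage trade-off between replication and erasure coding mentioned in the abstract can be illustrated with a small sketch. It assumes a toy (3, 2) XOR parity code, chosen here only for illustration; it is not the code used by Giza or Swift, and the function names `encode` and `decode` are hypothetical. The value is split into two data fragments plus one parity fragment, one per server, and any two of the three fragments suffice to rebuild the value.

```python
from typing import List, Optional


def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))


def encode(value: bytes) -> List[bytes]:
    """Toy (3, 2) code: two data fragments plus one XOR parity fragment."""
    if len(value) % 2:
        value += b"\x00"  # pad to even length; the caller keeps the true length
    half = len(value) // 2
    a, b = value[:half], value[half:]
    return [a, b, xor_bytes(a, b)]


def decode(fragments: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the value from any two of the three fragments (None = lost)."""
    if sum(f is None for f in fragments) > 1:
        raise ValueError("a (3, 2) code tolerates at most one lost fragment")
    a, b, parity = fragments
    if a is None:
        a = xor_bytes(b, parity)   # a = b XOR (a XOR b)
    elif b is None:
        b = xor_bytes(a, parity)   # b = a XOR (a XOR b)
    return (a + b)[:length]        # drop any padding


value = b"hello world!"            # 12 bytes
fragments = encode(value)

# Lose each fragment in turn; the value is still recoverable.
for lost in range(3):
    survivors = [f if i != lost else None for i, f in enumerate(fragments)]
    assert decode(survivors, len(value)) == value

coded_bytes = sum(len(f) for f in fragments)   # 3 fragments of 6 bytes
replicated_bytes = 3 * len(value)              # 3 full copies on 3 servers
print(coded_bytes, replicated_bytes)
```

On the same three servers, the coded layout stores 1.5 times the object size instead of 3 times, at the price of tolerating one lost fragment rather than two lost replicas. The sketch deliberately ignores concurrent writes; keeping fragments of different versions consistent is the kind of challenge the tutorial addresses.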

