数据存储的快速擦除编码

ACM Transactions on Storage (TOS) Pub Date : 2020-03-07 DOI:10.1145/3375554

Tianli Zhou, C. Tian

{"title":"数据存储的快速擦除编码","authors":"Tianli Zhou, C. Tian","doi":"10.1145/3375554","DOIUrl":null,"url":null,"abstract":"Various techniques have been proposed in the literature to improve erasure code computation efficiency, including optimizing bitmatrix design and computation schedule, common XOR (exclusive-OR) operation reduction, caching management techniques, and vectorization techniques. These techniques were largely proposed individually, and, in this work, we seek to use them jointly. To accomplish this task, these techniques need to be thoroughly evaluated individually and their relation better understood. Building on extensive testing, we develop methods to systematically optimize the computation chain together with the underlying bitmatrix. This led to a simple design approach of optimizing the bitmatrix by minimizing a weighted computation cost function, and also a straightforward coding procedure—follow a computation schedule produced from the optimized bitmatrix to apply XOR-level vectorization. This procedure provides better performances than most existing techniques (e.g., those used in ISA-L and Jerasure libraries), and sometimes can even compete against well-known but less general codes such as EVENODD, RDP, and STAR codes. One particularly important observation is that vectorizing the XOR operations is a better choice than directly vectorizing finite field operations, not only because of the flexibility in choosing finite field size and the better encoding throughput, but also its minimal migration efforts onto newer CPUs.","PeriodicalId":273014,"journal":{"name":"ACM Transactions on Storage (TOS)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Fast Erasure Coding for Data Storage\",\"authors\":\"Tianli Zhou, C. Tian\",\"doi\":\"10.1145/3375554\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Various techniques have been proposed in the literature to improve erasure code computation efficiency, including optimizing bitmatrix design and computation schedule, common XOR (exclusive-OR) operation reduction, caching management techniques, and vectorization techniques. These techniques were largely proposed individually, and, in this work, we seek to use them jointly. To accomplish this task, these techniques need to be thoroughly evaluated individually and their relation better understood. Building on extensive testing, we develop methods to systematically optimize the computation chain together with the underlying bitmatrix. This led to a simple design approach of optimizing the bitmatrix by minimizing a weighted computation cost function, and also a straightforward coding procedure—follow a computation schedule produced from the optimized bitmatrix to apply XOR-level vectorization. This procedure provides better performances than most existing techniques (e.g., those used in ISA-L and Jerasure libraries), and sometimes can even compete against well-known but less general codes such as EVENODD, RDP, and STAR codes. One particularly important observation is that vectorizing the XOR operations is a better choice than directly vectorizing finite field operations, not only because of the flexibility in choosing finite field size and the better encoding throughput, but also its minimal migration efforts onto newer CPUs.\",\"PeriodicalId\":273014,\"journal\":{\"name\":\"ACM Transactions on Storage (TOS)\",\"volume\":\"40 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-03-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Storage (TOS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3375554\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Storage (TOS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3375554","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 8

摘要

文献中提出了各种技术来提高擦除码的计算效率，包括优化位矩阵设计和计算时间表、减少常见异或操作、缓存管理技术和向量化技术。这些技术在很大程度上是单独提出的，在这项工作中，我们寻求将它们联合使用。为了完成这项任务，需要对这些技术单独进行彻底的评估，并更好地理解它们之间的关系。基于广泛的测试，我们开发了系统地优化计算链和底层比特矩阵的方法。这导致了通过最小化加权计算代价函数来优化位矩阵的简单设计方法，以及一个直接的编码过程-遵循由优化的位矩阵产生的计算时间表来应用异或级矢量化。此过程提供了比大多数现有技术(例如ISA-L和Jerasure库中使用的技术)更好的性能，有时甚至可以与众所周知但不太通用的代码(如EVENODD、RDP和STAR代码)竞争。一个特别重要的观察结果是，与直接向量化有限域操作相比，向量化异或操作是一个更好的选择，这不仅是因为选择有限域大小的灵活性和更好的编码吞吐量，而且还因为向新cpu迁移的工作量最小。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Fast Erasure Coding for Data Storage

Various techniques have been proposed in the literature to improve erasure code computation efficiency, including optimizing bitmatrix design and computation schedule, common XOR (exclusive-OR) operation reduction, caching management techniques, and vectorization techniques. These techniques were largely proposed individually, and, in this work, we seek to use them jointly. To accomplish this task, these techniques need to be thoroughly evaluated individually and their relation better understood. Building on extensive testing, we develop methods to systematically optimize the computation chain together with the underlying bitmatrix. This led to a simple design approach of optimizing the bitmatrix by minimizing a weighted computation cost function, and also a straightforward coding procedure—follow a computation schedule produced from the optimized bitmatrix to apply XOR-level vectorization. This procedure provides better performances than most existing techniques (e.g., those used in ISA-L and Jerasure libraries), and sometimes can even compete against well-known but less general codes such as EVENODD, RDP, and STAR codes. One particularly important observation is that vectorizing the XOR operations is a better choice than directly vectorizing finite field operations, not only because of the flexibility in choosing finite field size and the better encoding throughput, but also its minimal migration efforts onto newer CPUs.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Transactions on Storage (TOS)

自引率

0.00%

发文量

期刊最新文献

WebAssembly-based Delta Sync for Cloud Storage Services DEFUSE: An Interface for Fast and Correct User Space File System Access Donag: Generating Efficient Patches and Diffs for Compressed Archives Building GC-free Key-value Store on HM-SMR Drives with ZoneFS Kangaroo: Theory and Practice of Caching Billions of Tiny Objects on Flash