Toward Efficient Repair for Wide-Stripe Erasure Coding With High Reliability
Authors: Wei Wang, Zhipeng Li, Min Lyu, Liangliang Xu, Yinlong Xu
Journal: IEEE Transactions on Reliability, vol. 74, no. 3, pp. 3657-3670
Published: 2024-09-04
Publication type: Journal Article
DOI: 10.1109/TR.2024.3447706
URL: https://ieeexplore.ieee.org/document/10665910/
Abstract
Erasure coding is a common redundancy scheme that provides higher reliability with much lower storage overhead than replication. It prevents data loss due to failures but induces high repair costs. As data volumes grow exponentially, wide stripes have been proposed for extreme storage savings. Wide-stripe erasure codes, however, face higher repair costs for both single and multiple failures. Our extensive analysis shows that existing repair-efficient erasure codes, such as locally repairable codes (LRCs) and minimum storage regenerating (MSR) codes, cannot meet all the requirements of wide stripes: low storage overhead, low repair cost for both single and multiple failures, and high reliability. In this article, we explore an alternative code scheme, locally repairable with zigzag code (LRZC), which combines the advantages of LRCs and zigzag codes. LRZC divides data blocks and global parity blocks into evenly sized local groups and generates two local parity blocks in each group with a zigzag code. Within the storage overhead limit of wide stripes, LRZC reduces the repair cost for single and multiple failures and provides higher reliability than existing wide-stripe codes. Experiments show that LRZC reduces the repair cost of single and multiple failures by up to 41.9% and 41.7%, respectively, compared with state-of-the-art LRCs.
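The block layout described in the abstract can be made concrete with a small sketch. It models only the grouping and the resulting storage overhead; it does not implement the zigzag encoding itself, which the abstract does not specify. The class name `StripeLayout`, the parameter names `k`, `g`, and `groups`, the block labels, and the contiguous assignment of blocks to groups are all assumptions made for illustration, not the authors' construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripeLayout:
    """Hypothetical model of an LRZC-style stripe layout (illustrative only)."""
    k: int       # number of data blocks
    g: int       # number of global parity blocks
    groups: int  # number of local groups

    def __post_init__(self):
        # The abstract states the local groups are evenly sized.
        if (self.k + self.g) % self.groups != 0:
            raise ValueError("k + g must divide evenly into the local groups")

    @property
    def group_size(self) -> int:
        # Data and global parity blocks per group, excluding local parities.
        return (self.k + self.g) // self.groups

    @property
    def total_blocks(self) -> int:
        # Each group adds two local parity blocks.
        return self.k + self.g + 2 * self.groups

    @property
    def storage_overhead(self) -> float:
        # Stored blocks per block of original data.
        return self.total_blocks / self.k

    def local_groups(self):
        # Assumption: blocks are assigned to groups contiguously.
        labels = [f"D{i}" for i in range(self.k)] + [f"G{i}" for i in range(self.g)]
        size = self.group_size
        return [
            labels[j * size:(j + 1) * size] + [f"L{j}a", f"L{j}b"]
            for j in range(self.groups)
        ]


layout = StripeLayout(k=8, g=2, groups=2)
print(layout.total_blocks, layout.storage_overhead)
for group in layout.local_groups():
    print(group)
```

The parameters here are deliberately small for readability; a wide stripe would use a much larger `k`, which drives the overhead ratio closer to 1.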
About the journal:
IEEE Transactions on Reliability is a refereed journal for the reliability and allied disciplines including, but not limited to, maintainability, physics of failure, life testing, prognostics, design and manufacture for reliability, reliability for systems of systems, network availability, mission success, warranty, safety, and various measures of effectiveness. Topics eligible for publication range from hardware to software, from materials to systems, from consumer and industrial devices to manufacturing plants, from individual items to networks, from techniques for making things better to ways of predicting and measuring behavior in the field. As an engineering subject that supports new and existing technologies, we constantly expand into new areas of the assurance sciences.