{"title":"Concurrent Failure Recovery for Product Matrix Regenerating Code","authors":"Jingyao Zhang","doi":"10.1109/PDCAT46702.2019.00060","DOIUrl":null,"url":null,"abstract":"Regenerating codes can minimize the network bandwidth required to recover the lost data in case of node failure in distributed storage systems. Product Matrix (PM) code is an important kind of Minimum Storage Regenerating (MSR) code that can maximize the storage efficiency, meanwhile minimizing the repair bandwidth. The original Product Matrix (PM) code only addressed single node failure. In this work, we will propose an algorithm of recovering multiple failed nodes concurrently for PM code. The explicit construction of the Repair Matrix that is applicable to any reasonable combinations of coding parameters will be presented, and the lost data can be obtained by simply multiplying the helper data with the repair matrix, thus is very easy for implementation. Based on the proposed strategy, the needed bandwidth for two major repairing policies: centralized and distributed recovery will be given formally. 
Moreover, the impact of Repairing Degree (the number of surviving nodes from which the assistant data are downloaded) on the bandwidth cost will be studied, which can help make optimal decisions in practical storage systems.","PeriodicalId":166126,"journal":{"name":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDCAT46702.2019.00060","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 1
Abstract
Regenerating codes minimize the network bandwidth required to recover lost data after a node failure in distributed storage systems. The Product Matrix (PM) code is an important class of Minimum Storage Regenerating (MSR) codes, which maximize storage efficiency while minimizing repair bandwidth. The original PM code addressed only single-node failure. In this work, we propose an algorithm for recovering multiple failed nodes concurrently under the PM code. We present an explicit construction of the Repair Matrix that applies to any reasonable combination of coding parameters; the lost data are obtained simply by multiplying the helper data by the repair matrix, which makes the scheme easy to implement. Based on the proposed strategy, we formally give the bandwidth required by the two major repair policies, centralized and distributed recovery. Moreover, we study the impact of the Repairing Degree (the number of surviving nodes from which helper data are downloaded) on the bandwidth cost, which can help make optimal decisions in practical storage systems.
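The idea of recovering several lost nodes at once with a single matrix multiplication can be illustrated on a much simpler code. The sketch below is not the PM-code Repair Matrix construction described in the abstract; it assumes a generic Vandermonde-based MDS code over the prime field GF(257), with one symbol per node, and every name in it (`generator`, `repair_matrix`, the parameters `n`, `k`, `P`) is hypothetical. It also does not model the bandwidth savings of regenerating codes: each helper here sends its whole symbol, and the number of helpers equals `k`.

```python
P = 257  # small prime field GF(257), chosen only for illustration


def mat_mul(a, b, p=P):
    """Multiply two matrices (lists of rows) modulo p."""
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)]
            for row in a]


def mat_inv(m, p=P):
    """Invert a square matrix modulo a prime p by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(row) + [int(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if aug[r][col] % p != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot becomes 1 (modular inverse).
        inv = pow(aug[col][col], -1, p)
        aug[col] = [(v * inv) % p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % p for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def generator(n, k, p=P):
    """n x k Vandermonde generator: node i stores sum_j m_j * (i+1)^j."""
    return [[pow(x, j, p) for j in range(k)] for x in range(1, n + 1)]


def repair_matrix(G, failed, helpers, p=P):
    """R = G_failed * inv(G_helpers), so that lost = R * helper_data."""
    G_f = [G[i] for i in failed]
    G_h = [G[i] for i in helpers]
    return mat_mul(G_f, mat_inv(G_h, p), p)


n, k = 6, 3
G = generator(n, k)
message = [[42], [7], [199]]      # k data symbols as a column vector
stored = mat_mul(G, message)      # one coded symbol per node

failed = [1, 4]                   # two nodes fail at the same time
helpers = [0, 2, 5]               # k surviving nodes act as helpers
R = repair_matrix(G, failed, helpers)

helper_data = [stored[i] for i in helpers]
recovered = mat_mul(R, helper_data)   # both lost symbols in one product
assert recovered == [stored[i] for i in failed]
```

The repair matrix depends only on which nodes failed and which nodes help, not on the stored data, so in this toy setting it could be precomputed per failure pattern and reused.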