Jehan-Francois Pâris, T. Schwarz, A. Amer, D. Long
Improving Disk Array Reliability Through Expedited Scrubbing
2010 IEEE Fifth International Conference on Networking, Architecture, and Storage
Published: 2010-07-01
DOI: 10.1109/NAS.2010.37 (https://doi.org/10.1109/NAS.2010.37)
Citations: 12
Abstract
Disk scrubbing periodically scans the contents of a disk array to detect the presence of irrecoverable read errors and reconstitute the contents of the lost blocks using the built-in redundancy of the disk array. We address the issue of scheduling scrubbing runs in disk arrays that can tolerate two disk failures without incurring a data loss, and propose to start an urgent scrubbing run of the whole array whenever a disk failure is detected. Used alone or in combination with periodic scrubbing runs, these expedited runs can improve the mean time to data loss of disk arrays over a wide range of disk repair times. As a result, our technique eliminates the need for frequent scrubbing runs and the need to maintain spare disks and personnel on site to replace failed disks within a twenty-four hour interval.
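
The scheduling policy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `ScrubScheduler`, its method names, the 720-hour interval, and the choice to restart the periodic timer after an expedited run are all assumptions made for illustration. The sketch only shows the idea of triggering a whole-array scrubbing run on a disk failure, either alone or alongside periodic runs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScrubScheduler:
    """Hypothetical scheduler combining periodic and expedited scrubbing runs."""

    # Hours between periodic runs; None disables periodic scrubbing entirely,
    # leaving only the expedited runs.
    scrub_interval_hours: Optional[float] = None
    last_scrub_hour: float = 0.0
    log: List[str] = field(default_factory=list)

    def _scrub(self, now: float, reason: str) -> None:
        # A real scrub would read every block on the surviving disks and
        # rebuild unreadable blocks from the array's redundancy.
        # Here we only record that a run was started.
        self.log.append(f"{now:.1f}h: {reason} scrub of whole array")
        # Assumption: any completed run restarts the periodic timer.
        self.last_scrub_hour = now

    def tick(self, now: float) -> None:
        """Start a periodic run if one is due."""
        if self.scrub_interval_hours is None:
            return
        if now - self.last_scrub_hour >= self.scrub_interval_hours:
            self._scrub(now, "periodic")

    def on_disk_failure(self, now: float) -> None:
        """Start an expedited run as soon as a disk failure is detected."""
        self._scrub(now, "expedited")


# Usage: periodic runs every 720 hours, with one disk failure at hour 1000.
scheduler = ScrubScheduler(scrub_interval_hours=720.0)
scheduler.tick(720.0)              # periodic run is due
scheduler.on_disk_failure(1000.0)  # failure triggers an expedited run
scheduler.tick(1440.0)             # not due: only 440 hours since last run
scheduler.tick(1720.0)             # due again, 720 hours after the expedited run
```

Setting `scrub_interval_hours=None` models the "used alone" case, in which scrubbing runs start only in response to disk failures.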