Eric Rozier, W. Sanders, Pin Zhou, N. Mandagere, Sandeep Uttamchandani, Mark L. Yakushev
Modeling the Fault Tolerance Consequences of Deduplication
2011 IEEE 30th International Symposium on Reliable Distributed Systems, 2011-10-04
DOI: 10.1109/SRDS.2011.18 (https://doi.org/10.1109/SRDS.2011.18)
Citations: 20
Abstract
Modern storage systems increasingly employ data deduplication. The systems on which it is deployed often hold important data and rely on fault-tolerant hardware and software to improve reliability and reduce data loss. We suggest that data deduplication introduces inter-file relationships that may harm the fault tolerance of such systems by creating dependencies that can increase the severity of data loss events. We present a framework, composed of data analysis methods and a model of data deduplication, for studying the reliability impact of data deduplication. The framework can be used to determine a deduplication strategy that is estimated to satisfy a set of user-supplied reliability constraints.
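The inter-file dependency described in the abstract can be illustrated with a toy sketch. This is not the authors' framework or model; it assumes fixed-size chunking, SHA-256 chunk identifiers, and hypothetical names (`chunk`, `build_index`, `files_lost`, and the sample files) chosen only for illustration. It shows that once identical chunks are stored once, losing a single shared chunk damages every file that references it.

```python
import hashlib
from collections import defaultdict


def chunk(data: bytes, size: int = 4) -> list[bytes]:
    """Split data into fixed-size chunks (toy chunking; size is an assumption)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def build_index(files: dict[str, bytes], size: int = 4) -> dict[str, set[str]]:
    """Map each unique chunk digest to the set of files that reference it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, data in files.items():
        for piece in chunk(data, size):
            index[hashlib.sha256(piece).hexdigest()].add(name)
    return dict(index)


def files_lost(index: dict[str, set[str]], lost_digest: str) -> set[str]:
    """Return the files damaged if the chunk with this digest is lost."""
    return index.get(lost_digest, set())


# Hypothetical sample data: three files that all share the chunk b"AAAA".
files = {
    "a.txt": b"AAAABBBB",
    "b.txt": b"AAAACCCC",
    "c.txt": b"DDDDAAAA",
}
index = build_index(files)

shared = hashlib.sha256(b"AAAA").hexdigest()
unique = hashlib.sha256(b"BBBB").hexdigest()

print(sorted(files_lost(index, shared)))  # the shared chunk affects all three files
print(sorted(files_lost(index, unique)))  # the unique chunk affects only a.txt
```

Without deduplication each file would hold its own copy of `b"AAAA"`, so a single lost block would damage one file. With the shared index above, the same fault damages three, which is the kind of increased loss severity the abstract refers to.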