Reliability-aware deduplication storage: Assuring chunk reliability and chunk loss severity

2011 International Green Computing Conference and Workshops | Pub Date: 2011-07-25 | DOI: 10.1109/IGCC.2011.6008566

Youngjin Nam, Guanlin Lu, D. Du


Citations: 9

Abstract

Reliability in deduplication storage has so far attracted little research attention. To provide the reliability demanded for an incoming data stream, most deduplication storage systems first deduplicate the stream by eliminating duplicates and then apply erasure coding to the remaining (unique) chunks. A unique chunk may be shared (i.e., duplicated) at many places in the data stream and may also be shared by other data streams. This sharing is why deduplication can reduce the required storage capacity. However, sharing occasionally makes it difficult to assure the reliability levels required by different data streams. We introduce two reliability parameters for deduplication storage: chunk reliability and chunk loss severity. Chunk reliability is each chunk's level of tolerance to failures. Chunk loss severity represents the expected damage in the event of a chunk loss, formally defined as the product of the actual damage and the probability of a chunk loss. We propose a reliability-aware deduplication solution that not only assures all demanded chunk reliability levels, by making an already existing chunk sharable only if its reliability is high enough, but also mitigates chunk loss severity by adaptively reducing the probability of a chunk loss. In addition, we outline directions for future research that follow from the current study.
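The two parameters and the sharing rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class and function names, the use of an integer "number of tolerated failures" as the reliability level, and the choice to upgrade an under-protected chunk in place (rather than, say, storing a second copy) are all assumptions made for illustration. The abstract only states that an existing chunk is shared when its reliability is high enough and that severity is damage multiplied by loss probability.

```python
from dataclasses import dataclass, field


@dataclass
class StoredChunk:
    fingerprint: str
    reliability: int      # assumption: number of failures the chunk tolerates
    ref_count: int = 1    # how many places reference this chunk


def loss_severity(damage: float, loss_probability: float) -> float:
    """Chunk loss severity: actual damage times probability of chunk loss."""
    return damage * loss_probability


@dataclass
class DedupStore:
    """Hypothetical in-memory index keyed by chunk fingerprint."""
    chunks: dict = field(default_factory=dict)

    def write(self, fingerprint: str, demanded: int) -> str:
        existing = self.chunks.get(fingerprint)
        if existing is None:
            # New unique chunk: store it at the demanded reliability level.
            self.chunks[fingerprint] = StoredChunk(fingerprint, demanded)
            return "stored"
        existing.ref_count += 1
        if existing.reliability >= demanded:
            # Existing chunk is reliable enough, so it can be shared as-is.
            return "shared"
        # Assumption: an under-protected chunk is re-encoded at the higher
        # level before sharing; the abstract does not specify this step.
        existing.reliability = demanded
        return "upgraded"
```

For example, writing the same fingerprint with demanded levels 2, 1, and 3 would return `"stored"`, `"shared"`, and `"upgraded"` in turn. A chunk's reference count could serve as a rough proxy for the damage term passed to `loss_severity`, though that mapping is also an assumption here.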



