A Lightweight Virtual Machine Image Deduplication Backup Approach in Cloud Environment

Jiwei Xu, Wen-bo Zhang, Shiyang Ye, Jun Wei, Tao Huang
Published in: 2014 IEEE 38th Annual Computer Software and Applications Conference
Publication date: 2014-07-21
DOI: 10.1109/COMPSAC.2014.73 (https://doi.org/10.1109/COMPSAC.2014.73)
Citations: 21

Abstract

Because most clouds are built on virtualization technology, a growing number of virtual machine images are created within data centers. To meet disaster recovery needs, the storage space used for backups can easily grow to the terabyte or petabyte level as the number of images increases. Unfortunately, different images share a large number of identical data segments, and these duplicated segments seriously waste storage resources. Although much existing work focuses on deduplication storage and achieves good results in removing duplicate copies, it is not well suited to virtual machine image deduplication in a cloud environment, because the heavy resource usage of deduplication operations can seriously interfere with the performance of the hosting virtual machines. This paper proposes a local deduplication method that speeds up virtual machine image deduplication and reduces operation time. The method is based on an improved k-means clustering algorithm, which classifies the metadata of backup images to reduce the search space of index lookups and improve lookup performance. Experiments show that the approach is robust and effective: it significantly reduces performance interference with the hosting virtual machines at the cost of an acceptable increase in disk space usage.
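
The idea of clustering image metadata so that each index lookup only searches one cluster can be illustrated with a minimal sketch. This is not the paper's algorithm: it uses plain Lloyd's k-means rather than the improved variant, and the feature vectors (here a made-up OS family id and image size), the fixed 4 KiB chunk size, and every function and class name are assumptions introduced only for illustration.

```python
import hashlib
from collections import defaultdict

CHUNK_SIZE = 4096  # assumed fixed-size chunking; the abstract does not specify


def chunk_fingerprints(data, size=CHUNK_SIZE):
    """Split data into fixed-size chunks and return one SHA-1 digest per chunk."""
    return [hashlib.sha1(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means, seeded with the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, members in groups.items():
            centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


class ClusteredIndex:
    """Fingerprint index partitioned by cluster; lookups touch one partition."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = defaultdict(set)  # cluster id -> set of chunk fingerprints

    def backup(self, features, data):
        """Deduplicate data against its cluster only; return (new, duplicate)."""
        bucket = self.buckets[nearest(features, self.centroids)]
        new = dup = 0
        for fp in chunk_fingerprints(data):
            if fp in bucket:
                dup += 1
            else:
                bucket.add(fp)
                new += 1
        return new, dup


# Hypothetical metadata features: (os_family_id, image_size_gb)
centroids = kmeans([(0.0, 10.0), (5.0, 40.0), (0.0, 12.0), (5.0, 42.0)], k=2)
index = ClusteredIndex(centroids)

base = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
variant = b"A" * 4096 + b"B" * 4096 + b"D" * 4096

print(index.backup((0.0, 10.0), base))     # (3, 0): first image, all chunks new
print(index.backup((0.0, 12.0), variant))  # (1, 2): same cluster, two duplicates found
print(index.backup((5.0, 40.0), variant))  # (3, 0): other cluster, duplicates missed
```

The last call shows the trade-off the abstract describes: restricting each lookup to one cluster shrinks the search space, but chunks duplicated across clusters are stored again, which costs some extra disk space.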