A High Effective Indexing and Retrieval Method Providing Block-Level Timely Recovery to Any Point-in-Time

Yonghong Sheng, Dan Xu, Dongsheng Wang
Published in: 2010 IEEE Fifth International Conference on Networking, Architecture, and Storage
Publication date: 2010-07-15
DOI: 10.1109/NAS.2010.63 (https://doi.org/10.1109/NAS.2010.63)
Citations: 4

Abstract

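Block-level continuous data protection (CDP) logs every disk write operation so that the disk can be rolled back to any point in time within a time window. Because each update operation is timestamped and logged, indexing such a huge number of records is an important and challenging problem. Conventional indexing methods cannot efficiently record large numbers of versions or support instant "time-travel" queries in CDP. In this paper, we present an effective indexing method that provides timely recovery to any point in time in comprehensive versioning systems, called the Hierarchical Spatial-Temporal Indexing Method (HSTIM). The basic principle of HSTIM is to partition the time domain and the production storage LBAs into time slices and segments, respectively, according to the update frequency of disk I/Os, and to build a separate index file for each segment. To support instant views of history data, the metadata of the production storage is indexed independently. For retrieval of long-time history data, HSTIM introduces index snapshots to reduce retrieval time. Another distinctive feature of HSTIM is its incremental retrieval method, which achieves high query performance at time point t + Δt if the neighboring time point t was queried previously. The paper compares HSTIM with the traditional B+-tree and multi-version B-tree (MVBT) indexes in many aspects. Experiments with real-workload I/O trace files show that HSTIM can locate history data within 8.05 seconds for a recovery point of 48 hours, while the B+-tree takes 24.04 seconds. If the index snapshot is applied, HSTIM reduces this retrieval time to within 3 seconds.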
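The abstract gives no data-structure details, so the following is only a minimal sketch of the general idea of bucketing write records by time slice and LBA segment, not HSTIM itself. The class name `SliceSegmentIndex`, its methods, and the constants are hypothetical, and the slice and segment sizes are fixed here for simplicity, whereas the paper sizes them by update frequency.

```python
from bisect import bisect_right
from collections import defaultdict

SEGMENT_BLOCKS = 1024  # assumed: LBAs per segment (fixed size, unlike the paper)
SLICE_SECONDS = 3600   # assumed: length of one time slice in seconds


class SliceSegmentIndex:
    """Toy index: (time slice, LBA segment) -> per-LBA write history."""

    def __init__(self):
        # buckets[(slice_no, segment_no)][lba] = [(timestamp, log_offset), ...]
        self.buckets = defaultdict(lambda: defaultdict(list))

    def log_write(self, lba, timestamp, log_offset):
        """Record one write; writes are assumed to arrive in timestamp order."""
        key = (timestamp // SLICE_SECONDS, lba // SEGMENT_BLOCKS)
        self.buckets[key][lba].append((timestamp, log_offset))

    def lookup(self, lba, t):
        """Log offset of the latest write to `lba` at or before t, else None."""
        segment = lba // SEGMENT_BLOCKS
        # Walk backwards from the slice containing t to the oldest slice.
        for slice_no in range(t // SLICE_SECONDS, -1, -1):
            versions = self.buckets.get((slice_no, segment), {}).get(lba)
            if not versions:
                continue
            i = bisect_right(versions, (t, float("inf")))
            if i:
                return versions[i - 1][1]
        return None

    def advance(self, view, t, t_next):
        """Incremental retrieval: bring a view valid at time t forward to
        t_next >= t by applying only writes with t < timestamp <= t_next."""
        first, last = t // SLICE_SECONDS, t_next // SLICE_SECONDS
        for key in sorted(self.buckets):  # oldest slice first
            if not first <= key[0] <= last:
                continue  # a real index would not open these slice files
            for lba, versions in self.buckets[key].items():
                i = bisect_right(versions, (t_next, float("inf")))
                if i and versions[i - 1][0] > t:
                    view[lba] = versions[i - 1][1]
        return view

    def view_at(self, t):
        """Full LBA -> log offset mapping as of time t, built from scratch."""
        return self.advance({}, -1, t)


idx = SliceSegmentIndex()
idx.log_write(5, 10, 100)       # slice 0, segment 0
idx.log_write(5, 4000, 200)     # slice 1, segment 0
idx.log_write(2000, 4100, 300)  # slice 1, segment 1
idx.log_write(5, 8000, 400)     # slice 2, segment 0

view = idx.view_at(4050)             # built from scratch
view = idx.advance(view, 4050, 8000) # cheap step to a neighboring time point
```

In this sketch, a stored result of `view_at` taken at a slice boundary plays the role of an index snapshot: a later query can start `advance` from that saved view instead of scanning every slice from the beginning of the log.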