Recomputing Materialized Instances after Changes to Mappings and Data

Todd J. Green, Z. Ives
Published in: 2012 IEEE 28th International Conference on Data Engineering (ICDE), 2012-04-01
DOI: 10.1109/ICDE.2012.107 (https://doi.org/10.1109/ICDE.2012.107)
Citations: 5

Abstract

A major challenge faced by today's information systems is that of evolution as data usage evolves or new data resources become available. Modern organizations sometimes exchange data with one another via declarative mappings among their databases, as in data exchange and collaborative data sharing systems. Such mappings are frequently revised and refined as new data becomes available, new cross-reference tables are created, and corrections are made. A fundamental question is how to handle changes to these mapping definitions, when the organizations each materialize the results of applying the mappings to the available data. We consider how to incrementally recompute these database instances in this setting, reusing (if possible) previously computed instances to speed up computation. We develop a principled solution that performs cost-based exploration of recomputation versus reuse, and simultaneously handles updates to source data and mapping definitions through a single, unified mechanism. Our solution also takes advantage of provenance information, when present, to speed up computation even further. We present an implementation that takes advantage of an off-the-shelf DBMS's query processing system, and we show experimentally that our approach provides substantial performance benefits.
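The choice between recomputation and reuse described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm or implementation. It assumes a hypothetical old mapping `T(a, c) :- R(a, b), S(b, c)` that is revised by adding a filter on the output, and a deliberately naive cost model that counts the tuples each plan has to read. All names (`join`, `recompute`, `reuse`, `choose_plan`) and the sample data are made up for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple

Row = Tuple[int, int]


def join(r: Set[Row], s: Set[Row]) -> Set[Row]:
    """Hypothetical old mapping: T(a, c) :- R(a, b), S(b, c)."""
    by_b: Dict[int, List[int]] = {}
    for b, c in s:
        by_b.setdefault(b, []).append(c)
    return {(a, c) for a, b in r for c in by_b.get(b, [])}


def recompute(r: Set[Row], s: Set[Row], keep: Callable[[Row], bool]) -> Set[Row]:
    """Evaluate the revised mapping from scratch: join the sources, then filter."""
    return {t for t in join(r, s) if keep(t)}


def reuse(t_old: Set[Row], keep: Callable[[Row], bool]) -> Set[Row]:
    """Evaluate the revised mapping by filtering the old materialized instance."""
    return {t for t in t_old if keep(t)}


def choose_plan(r: Set[Row], s: Set[Row], t_old: Set[Row]) -> str:
    """Naive cost model (an assumption): cost = number of tuples a plan reads."""
    cost_recompute = len(r) + len(s)
    cost_reuse = len(t_old)
    return "reuse" if cost_reuse <= cost_recompute else "recompute"


# Source data and the instance materialized under the old mapping.
R = {(1, 10), (2, 10), (3, 20)}
S = {(10, 100), (20, 200)}
T_old = join(R, S)

# Revised mapping: same join, but keep only tuples whose second column is >= 200.
keep = lambda t: t[1] >= 200

plan = choose_plan(R, S, T_old)
T_new = reuse(T_old, keep) if plan == "reuse" else recompute(R, S, keep)
print(plan, sorted(T_new))
```

Reuse is valid here only because the revised mapping is the old one plus a filter on output columns, so both plans yield the same instance. A real system would need to check such containment conditions and would rely on the DBMS optimizer's cost estimates rather than tuple counts.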