Title: Data Provenance for Historical Queries in Relational Database
Authors: A. Rani, Navneet Goyal, S. Gadia
Journal: Compute
Published: 2015-10-29
DOI: 10.1145/2835043.2835047 (https://doi.org/10.1145/2835043.2835047)
Citations: 9
Abstract
Capturing, modeling, and querying data provenance in databases has gained considerable importance in the last decade. Applications of all kinds built on top of databases now collect provenance for purposes such as assessing the trustworthiness of data, update management, and quality measurement. These purposes require efficiently capturing, storing, and querying provenance information for both current and historical queries executed on the database. Most existing provenance models, such as DBNotes, MONDRIAN, Perm, Orchestra, TRIO, and GProM, are suitable for capturing and querying provenance in relational databases. All of these models capture provenance only for currently executing queries, except TRIO and GProM, which can also capture and query provenance for historical queries. However, the time and space complexity of these two models is very high. In this paper, we propose a framework, Data Provenance for Historical Queries (DPHQ), that efficiently captures and queries provenance for both current and historical queries. The proposed model also supports provenance for updates. Our model uses the Zero Information Loss Database [2] to execute historical queries at any point in time, using the concept of nested relations. A graph database is used to store and subsequently query the provenance information.
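To make the last two ideas concrete, the following minimal Python sketch shows one way a historical query and its provenance could fit together. It is an illustration under stated assumptions, not the DPHQ implementation: `HistoryStore`, `Version`, and `historical_select` are hypothetical names, the append-only version list merely stands in for a zero-information-loss store, and a plain dictionary of edges stands in for the graph database.

```python
from dataclasses import dataclass, field

INF = float("inf")  # marks a version that is still current


@dataclass
class Version:
    value: dict   # attribute values of the tuple
    start: int    # first instant at which this version is valid
    end: float    # first instant at which it is no longer valid


@dataclass
class HistoryStore:
    """Hypothetical append-only store: an update closes the old version
    instead of overwriting it, so no past state is lost."""
    rows: dict = field(default_factory=dict)  # key -> list of Version

    def put(self, key, value, now):
        versions = self.rows.setdefault(key, [])
        if versions and versions[-1].end == INF:
            versions[-1].end = now  # close the previously current version
        versions.append(Version(value, now, INF))

    def as_of(self, t):
        """Return {key: (version_index, value)} for versions valid at time t."""
        snapshot = {}
        for key, versions in self.rows.items():
            for idx, v in enumerate(versions):
                if v.start <= t < v.end:
                    snapshot[key] = (idx, v.value)
        return snapshot


def historical_select(store, predicate, t, graph, query_id):
    """Run a selection against the state at time t and record, for each
    result tuple, an edge to the exact source version it came from."""
    result = []
    for key, (idx, value) in store.as_of(t).items():
        if predicate(value):
            result.append(value)
            # Assumed edge layout: (query, key) -> list of source versions.
            graph.setdefault((query_id, key), []).append(("tuple", key, idx))
    return result


store = HistoryStore()
store.put("e1", {"name": "Ann", "salary": 100}, now=1)
store.put("e2", {"name": "Bob", "salary": 90}, now=3)
store.put("e1", {"name": "Ann", "salary": 150}, now=5)  # update keeps the old version

graph = {}
past = historical_select(store, lambda r: r["salary"] >= 100, 4, graph, "q1")
present = historical_select(store, lambda r: r["salary"] >= 100, 6, graph, "q2")
```

In this sketch, the query as of time 4 sees Ann's original salary and the query as of time 6 sees the updated one, while `graph` records which stored version each result was derived from. A real system would replace the dictionary with a graph database and the version lists with nested relations.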