Proceedings of the ... ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems: Latest Articles

A quest for beauty and wealth (or, business processes for database researchers)

Pub Date : 2011-06-13 DOI: 10.1145/1989284.1989286

Daniel Deutch, T. Milo

While classic data management focuses on the data itself, research on Business Processes also considers the context in which this data is generated and manipulated, namely the processes, the users, and the goals that this data serves. This gives analysts a better perspective on the organizational needs centered around the data. As such, this research is of fundamental importance. Much of the success of database systems in the last decade is due to the beauty and elegance of the relational model and its declarative query languages, combined with a rich spectrum of underlying evaluation and optimization techniques, and efficient implementations. This, in turn, has led to economic wealth for both the users and vendors of database systems. Similar beauty and wealth are sought in the context of Business Processes. Much as in traditional database research, elegant modeling and rich underlying technology are likely to bring economic wealth to Business Process owners and their users; both can benefit from easy formulation and analysis of the processes. While there have been many important advances in this research in recent years, there is still much to be desired: specifically, many works focus on the process behavior (flow), and many focus on its data, but only very few have dealt with both. We discuss here the important advantages of a holistic flow-and-data framework for Business Processes and the progress towards such a framework, and highlight the current gaps and research directions.

Pages: 1-12

Citations: 22

Querying semantic web data with SPARQL

Pub Date : 2011-06-13 DOI: 10.1145/1989284.1989312

M. Arenas, Jorge Pérez

The Semantic Web is the initiative of the W3C to make information on the Web readable not only by humans but also by machines. RDF is the data model for Semantic Web data, and SPARQL is the standard query language for this data model. In the last ten years, we have witnessed a constant growth in the amount of RDF data available on the Web, which has motivated the theoretical study of some fundamental aspects of SPARQL and the development of efficient mechanisms for implementing this query language. Some of the distinctive features of RDF have made the study and implementation of SPARQL challenging. First, as opposed to usual database applications, the semantics of RDF is open world, making RDF databases inherently incomplete. Thus, one usually obtains partial answers when querying RDF with SPARQL, and the possibility of adding optional information if present is a crucial feature of SPARQL. Second, RDF databases have a graph structure and are interlinked, thus making graph navigational capabilities a necessary component of SPARQL. Last, but not least, SPARQL has to work at Web scale! RDF and SPARQL have attracted interest from the database community. However, we think that this community has much more to say about these technologies, and, in particular, about the fundamental database problems that need to be solved in order to provide solid foundations for the development of these technologies. In this paper, we survey some of the main results about the theory of RDF and SPARQL, with emphasis on some research opportunities for the database community.
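
The "optional information if present" feature mentioned above can be illustrated with a minimal Python sketch. It is not taken from the paper: the triple store, the `match_pattern`, `compatible`, and `optional` helpers, and the sample data are all hypothetical, and the sketch only mimics the left-outer-join behaviour usually associated with SPARQL's OPTIONAL operator over solution mappings.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Mapping = Dict[str, str]


def match_pattern(pattern: Triple, graph: List[Triple]) -> List[Mapping]:
    """Return one mapping of the pattern's ?variables per matching triple."""
    results: List[Mapping] = []
    for triple in graph:
        mapping: Mapping = {}
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must bind consistently within one pattern.
                if mapping.get(term, value) != value:
                    ok = False
                    break
                mapping[term] = value
            elif term != value:
                ok = False
                break
        if ok:
            results.append(mapping)
    return results


def compatible(m1: Mapping, m2: Mapping) -> bool:
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m2[var] == val for var, val in m1.items() if var in m2)


def optional(left: List[Mapping], right: List[Mapping]) -> List[Mapping]:
    """Extend each left mapping with compatible right mappings when any exist;
    otherwise keep the left mapping unextended (a partial answer)."""
    out: List[Mapping] = []
    for m1 in left:
        extensions = [{**m1, **m2} for m2 in right if compatible(m1, m2)]
        out.extend(extensions if extensions else [m1])
    return out


# Hypothetical data: bob has no email triple, so his answer stays partial.
GRAPH: List[Triple] = [
    ("alice", "name", "Alice"),
    ("alice", "email", "alice@example.org"),
    ("bob", "name", "Bob"),
]

names = match_pattern(("?p", "name", "?n"), GRAPH)
emails = match_pattern(("?p", "email", "?e"), GRAPH)
answers = optional(names, emails)
print(answers)
```

In this sketch the answer for alice carries a binding for `?e`, while the answer for bob is returned without one instead of being dropped, which is the behaviour an open-world data model calls for.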

Pages: 305-316

Citations: 143

On finding skylines in external memory

Pub Date : 2011-06-13 DOI: 10.1145/1989284.1989298

Cheng Sheng, Yufei Tao

We consider the skyline problem (a.k.a. the maxima problem), which has been extensively studied in the database community. The input is a set P of d-dimensional points. A point dominates another if the former has a lower coordinate than the latter on every dimension. The goal is to find the skyline, which is the set of points p ∈ P such that p is not dominated by any other data point. In the external-memory model, the 2-d version of the problem is known to be solvable in $O((N/B)\log_{M/B}(N/B))$ I/Os, where N is the cardinality of P, B the size of a disk block, and M the capacity of main memory. For fixed d ≥ 3, we present an algorithm with I/O-complexity $O((N/B)\log^{d-2}_{M/B}(N/B))$. Previously, the best solution was adapted from an in-memory algorithm, and requires $O((N/B)\log^{d-2}_{2}(N/M))$ I/Os.
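
The dominance definition above can be made concrete with a small Python sketch. This is a quadratic in-memory baseline written only for illustration; it is not the external-memory algorithm of the paper, and the function names and sample points are assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """p dominates q if p has a lower coordinate than q on every dimension."""
    return all(a < b for a, b in zip(p, q))


def skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (O(N^2) comparisons)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical 2-d input.
POINTS: List[Point] = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
print(skyline(POINTS))
```

Here (3, 3) and (4, 4) are dominated by (2, 2), so the sketch reports (1, 5), (2, 2), and (5, 1). Because dominance is strict on every dimension, a point never dominates itself or an exact duplicate.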

Citations: 43

Incomplete information and certain answers in general data models

Pub Date : 2011-06-13 DOI: 10.1145/1989284.1989294

L. Libkin

While incomplete information is ubiquitous in all data models - especially in applications involving data translation or integration - our understanding of it is still not completely satisfactory. For example, even such a basic notion as certain answers for XML queries was only introduced recently, and in a way seemingly rather different from relational certain answers. The goal of this paper is to introduce a general approach to handling incompleteness, and to test its applicability in known data models such as relations and documents. The approach is based on representing degrees of incompleteness via semantics-based orderings on database objects. We use it both to obtain new results on incompleteness and to explain some previously observed phenomena. Specifically, we show that certain answers for relational and XML queries are two instances of the same general concept; we describe structural properties behind the naive evaluation of queries; answer open questions on the existence of certain answers in the XML setting; and show that previously studied ordering-based approaches were only adequate for SQL's primitive view of nulls. We define a general setting that subsumes relations and documents to help us explain in a uniform way how to compute certain answers, and when good solutions can be found in data exchange. We also look at the complexity of common problems related to incompleteness, and generalize several results from relational and XML contexts.
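
As a rough illustration of naive evaluation in the relational case, the Python sketch below treats marked nulls as ordinary values that are equal only to themselves, evaluates a join-and-project query, and then discards tuples that still contain nulls. For conjunctive queries this classical recipe is known to return the certain answers. The `Null` class, the relations `R` and `S`, and the helper names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Null:
    """A marked null: two nulls are equal only if they carry the same name."""
    name: str


def join_project(r: Set[Tuple], s: Set[Tuple]) -> Set[Tuple]:
    """Evaluate pi_{a,c}(R(a,b) JOIN S(b,c)), treating nulls as plain values."""
    return {(a, c) for (a, b1) in r for (b2, c) in s if b1 == b2}


def certain_answers_naive(r: Set[Tuple], s: Set[Tuple]) -> Set[Tuple]:
    """Naive evaluation: run the query, then keep only null-free tuples."""
    return {
        t for t in join_project(r, s)
        if not any(isinstance(v, Null) for v in t)
    }


# Hypothetical incomplete database.
R = {("ann", Null("x")), ("bob", "sales"), (Null("y"), "hr")}
S = {(Null("x"), "paris"), ("sales", Null("z")), ("hr", "rome")}

print(certain_answers_naive(R, S))
```

Only ("ann", "paris") survives: whatever value the null x stands for, the two tuples mentioning x still join, whereas the other join results depend on unknown values.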

Pages: 59-70

Citations: 39

Efficient evaluation for a temporal logic on changing XML documents

Pub Date : 2011-06-13 DOI: 10.1145/1989284.1989317

M. Bojanczyk, Diego Figueira

We consider a sequence $t_1, \ldots, t_k$ of XML documents that is produced by a sequence of local edit operations. To describe properties of such a sequence, we use a temporal logic. The logic can navigate both in time and in the document, e.g. a formula can say that every node with label a eventually gets a descendant with label b. For every fixed formula, we provide an evaluation algorithm that works in time O(k ⋅ log(n)), where k is the number of edit operations and n is the maximal size of a document that is produced. In the algorithm, we represent formulas of the logic by a kind of automaton, which works on sequences of documents. The algorithm works on XML documents of bounded depth.
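
The example property above ("every node with label a eventually gets a descendant with label b") can be checked by brute force, as in the Python sketch below. This is only a naive reference checker, far slower than the O(k ⋅ log(n)) algorithm the abstract describes. The document encoding (a dictionary from node id to label and parent id, with ids persisting across versions) and the reading of "eventually" as "in the current or some later version" are assumptions made for the sketch.

```python
from typing import Dict, List, Optional, Tuple

# One document version: node id -> (label, parent id or None for the root).
Document = Dict[int, Tuple[str, Optional[int]]]


def has_descendant_with_label(doc: Document, node: int, label: str) -> bool:
    """True if some proper descendant of `node` in `doc` carries `label`."""
    for other_label, parent in doc.values():
        if other_label != label:
            continue
        # Walk up from the candidate towards the root, looking for `node`.
        while parent is not None:
            if parent == node:
                return True
            parent = doc[parent][1]
    return False


def every_a_eventually_gets_b(docs: List[Document], a: str = "a", b: str = "b") -> bool:
    """Check the property for every a-labelled node of every version."""
    for i, doc in enumerate(docs):
        for node, (label, _) in doc.items():
            if label != a:
                continue
            if not any(
                node in later and has_descendant_with_label(later, node, b)
                for later in docs[i:]
            ):
                return False
    return True


# Hypothetical edit history: node 3 is inserted below the a-node, then node 4.
T1: Document = {1: ("root", None), 2: ("a", 1)}
T2: Document = {1: ("root", None), 2: ("a", 1), 3: ("c", 2)}
T3: Document = {1: ("root", None), 2: ("a", 1), 3: ("c", 2), 4: ("b", 3)}

print(every_a_eventually_gets_b([T1, T2, T3]))
print(every_a_eventually_gets_b([T1, T2]))
```

The property holds for the full history because node 2 has a b-labelled descendant in the last version, and fails for the prefix that stops before that edit.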

Citations: 3

New results on two-dimensional orthogonal range aggregation in external memory

Pub Date : 2011-06-13 DOI: 10.1145/1989284.1989297

Cheng Sheng, Yufei Tao

We consider the orthogonal range aggregation problem. The dataset S consists of N axis-parallel rectangles in R², each of which is associated with an integer weight. Given an axis-parallel rectangle Q and an aggregate function F, a query reports the aggregated result of the weights of the rectangles in S intersecting Q. The goal is to preprocess S into a structure such that all queries can be answered efficiently. We present indexing schemes to solve the problem in external memory when F = max (hence, min) and F = sum (hence, count and average), respectively. Our schemes have linear or near-linear space, and answer a query in $O(\log_B N)$ or $O(\log_B^2 N)$ I/Os, where B is the disk block size.
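
To pin down what a query returns, the Python sketch below answers range-aggregate queries by scanning every rectangle. It is a linear-scan reference, not one of the indexing schemes of the paper. The `Rect` type, the helper names, the sample data, and the choice to treat rectangles as closed (so that touching boundaries count as intersecting) are assumptions.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Rect(NamedTuple):
    """Axis-parallel rectangle with corners (x1, y1) and (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float


def intersects(r: Rect, q: Rect) -> bool:
    """Closed rectangles intersect if they overlap on both axes."""
    return r.x1 <= q.x2 and q.x1 <= r.x2 and r.y1 <= q.y2 and q.y1 <= r.y2


def range_aggregate(
    data: List[Tuple[Rect, int]],
    q: Rect,
    agg: Callable[[List[int]], int],
) -> Optional[int]:
    """Apply `agg` to the weights of all rectangles intersecting `q`.
    Returns None when no rectangle intersects the query."""
    weights = [w for rect, w in data if intersects(rect, q)]
    return agg(weights) if weights else None


# Hypothetical dataset S and query Q.
S = [(Rect(0, 0, 2, 2), 5), (Rect(1, 1, 4, 4), 7), (Rect(5, 5, 6, 6), 3)]
Q = Rect(1.5, 1.5, 3, 3)

print(range_aggregate(S, Q, max), range_aggregate(S, Q, sum), range_aggregate(S, Q, len))
```

Passing `max`, `sum`, or `len` yields the maximum, sum, or count over the two rectangles that meet Q, and the average follows from sum and count.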

Citations: 22

Finding a minimal tree pattern under neighborhood constraints

Pub Date : 2011-06-13 DOI: 10.1145/1989284.1989318

B. Kimelfeld, Y. Sagiv

Tools that automatically generate queries are useful when schemas are hard to understand due to size or complexity. Usually, these tools find minimal tree patterns that contain a given set (or bag) of labels. The labels could be, for example, XML tags or relation names. The only restriction is that, in a tree pattern, adjacent labels must be among some specified pairs. A more expressive framework is developed here, where a schema is a mapping of each label to a collection of bags of labels. A tree pattern conforms to the schema if for all nodes v, the bag comprising the labels of the neighbors is contained in one of the bags to which the label of v is mapped. The problem at hand is to find a minimal tree pattern that conforms to the schema and contains a given bag of labels. This problem is NP-hard even when using the simplest conceivable language for describing schemas. In practice, however, the set of labels is small, so efficiency is realized by means of an algorithm that is fixed-parameter tractable (FPT). Two languages for specifying schemas are discussed. In the first, one expresses pairwise mutual exclusions between labels. Though W[1]-hardness (hence, unlikeliness of an FPT algorithm) is shown, an FPT algorithm is described for the case where the mutual exclusions form a circular-arc graph (e.g., disjoint cliques). The second language is that of regular expressions, and for that another FPT algorithm is described.
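
The conformance condition above (the bag of neighbor labels of each node must be contained in one of the bags assigned to that node's label) can be sketched in Python as follows. The sketch only checks whether a given tree pattern conforms; it does not search for a minimal pattern and is unrelated to the FPT algorithms of the paper. The encoding of patterns and schemas and the sample schema are assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple


def bag_contained(small: Counter, big: Counter) -> bool:
    """Bag containment: every label occurs in `big` at least as often as in `small`."""
    return all(big[label] >= count for label, count in small.items())


def conforms(
    labels: Dict[int, str],
    edges: List[Tuple[int, int]],
    schema: Dict[str, List[List[str]]],
) -> bool:
    """Check that each node's bag of neighbor labels fits one bag allowed by the schema."""
    neighbors: Dict[int, Counter] = {v: Counter() for v in labels}
    for u, v in edges:
        neighbors[u][labels[v]] += 1
        neighbors[v][labels[u]] += 1
    return all(
        any(
            bag_contained(neighbors[v], Counter(bag))
            for bag in schema.get(labels[v], [])
        )
        for v in labels
    )


# Hypothetical schema: a book may have up to two authors and one title.
SCHEMA = {
    "book": [["author", "author", "title"]],
    "author": [["book"]],
    "title": [["book"]],
}

GOOD_LABELS = {1: "book", 2: "author", 3: "title"}
GOOD_EDGES = [(1, 2), (1, 3)]

BAD_LABELS = {1: "book", 2: "author", 3: "author", 4: "author"}
BAD_EDGES = [(1, 2), (1, 3), (1, 4)]

print(conforms(GOOD_LABELS, GOOD_EDGES, SCHEMA))
print(conforms(BAD_LABELS, BAD_EDGES, SCHEMA))
```

The first pattern conforms, while the second is rejected because three author neighbors exceed the two allowed by the only bag mapped to "book".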

Pages: 235-246

Citations: 3

Provenance for aggregate queries

Pub Date : 2011-01-05 DOI: 10.1145/1989284.1989302

Yael Amsterdamer, Daniel Deutch, V. Tannen

In this paper, we study provenance information for queries with aggregation. Provenance information has been studied in the context of various query languages that do not allow for aggregation, and recent work has suggested capturing provenance by annotating the different database tuples with elements of a commutative semiring and propagating the annotations through query evaluation. We show that aggregate queries pose novel challenges that render this approach inapplicable. Consequently, we propose a new approach, where we annotate with provenance information not just tuples but also the individual values within tuples, using provenance to describe how the values are computed. We realize this approach in a concrete construction, first for "simple" queries where the aggregation operator is the last one applied, and then for arbitrary (positive) relational algebra queries with aggregation; the latter queries are shown to be more challenging in this context. Finally, we use aggregation to encode queries with difference, and study the semantics obtained for such queries on provenance-annotated databases.
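
The tuple-level semiring annotation that the abstract takes as its starting point can be sketched in Python for a query without aggregation: joining two tuples multiplies their annotations, and merging duplicates under projection adds them. The sketch covers only this earlier, aggregation-free approach, not the value-level construction the paper proposes. The relations, annotation names, and the `join_project` helper are hypothetical.

```python
from typing import Callable, Dict, Tuple, TypeVar

K = TypeVar("K")


def join_project(
    r: Dict[Tuple[str, str], K],
    s: Dict[Tuple[str, str], K],
    plus: Callable[[K, K], K],
    times: Callable[[K, K], K],
) -> Dict[str, K]:
    """Evaluate pi_a(R(a,b) JOIN S(b,c)) over K-annotated relations."""
    out: Dict[str, K] = {}
    for (a, b1), ann_r in r.items():
        for (b2, _c), ann_s in s.items():
            if b1 == b2:
                term = times(ann_r, ann_s)  # joint use of two tuples
                # Alternative derivations of the same output are added.
                out[a] = plus(out[a], term) if a in out else term
    return out


# Counting semiring (natural numbers): annotations are multiplicities.
R_COUNT = {("e1", "d1"): 2, ("e2", "d1"): 1}
S_COUNT = {("d1", "p1"): 3, ("d1", "p2"): 1}
counts = join_project(R_COUNT, S_COUNT, lambda x, y: x + y, lambda x, y: x * y)

# Symbolic annotations: each output records which input tuples produced it.
R_SYM = {("e1", "d1"): "r1", ("e2", "d1"): "r2"}
S_SYM = {("d1", "p1"): "s1", ("d1", "p2"): "s2"}
symbolic = join_project(
    R_SYM, S_SYM, lambda x, y: f"{x} + {y}", lambda x, y: f"{x}*{y}"
)

print(counts)
print(symbolic)
```

With multiplicities, e1 is derived 2·3 + 2·1 = 8 times. With symbolic annotations the same computation yields the expression r1*s1 + r1*s2, which records the two alternative derivations.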

Pages: 153-164

Citations: 172

Relational transducers for declarative networking

Pub Date : 2010-12-13 DOI: 10.1145/1989284.1989321

Tom J. Ameloot, F. Neven, J. V. D. Bussche

Motivated by a recent conjecture concerning the expressiveness of declarative networking, we propose a formal computation model for "eventually consistent" distributed querying, based on relational transducers. A tight link has been conjectured between coordination-freeness of computations, and monotonicity of the queries expressed by such computations. Indeed, we propose a formal definition of coordination-freeness and confirm that the class of monotone queries is captured by coordination-free transducer networks. Coordination-freeness is a semantic property, but the syntactic class of "oblivious" transducers we define also captures the same class of monotone queries. Transducer networks that are not coordination-free are much more powerful.
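
The link between monotonicity and coordination-freeness can be illustrated informally with the Python simulation below. It is a loose sketch and not the formal transducer model of the paper: each node starts with a fragment of an edge relation, facts are delivered in an arbitrary order, and every node outputs the transitive closure (a monotone query) of whatever it has seen so far without ever retracting an output. All names, the message-passing scheme, and the sample data are assumptions.

```python
import random
from typing import List, Set, Tuple

Edge = Tuple[int, int]


def transitive_closure(edges: Set[Edge]) -> Set[Edge]:
    """Monotone query: adding input edges can only add output pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def run_network(fragments: List[Set[Edge]], seed: int) -> List[Set[Edge]]:
    """Simulate nodes that gossip their local facts in a random delivery order."""
    rng = random.Random(seed)
    known = [set(f) for f in fragments]
    output: List[Set[Edge]] = [set() for _ in fragments]
    # Every node sends each of its local facts to every other node.
    messages = [
        (dst, fact)
        for src, frag in enumerate(fragments)
        for fact in sorted(frag)
        for dst in range(len(fragments))
        if dst != src
    ]
    rng.shuffle(messages)
    for dst, fact in messages:
        known[dst].add(fact)
        # Output eagerly: no node waits to learn that its input is complete.
        output[dst] |= transitive_closure(known[dst])
    for i in range(len(fragments)):
        output[i] |= transitive_closure(known[i])
    return output


# Hypothetical horizontal partition of a path 1 -> 2 -> 3 -> 4.
FRAGMENTS: List[Set[Edge]] = [{(1, 2)}, {(2, 3)}, {(3, 4)}]
EXPECTED = transitive_closure({(1, 2), (2, 3), (3, 4)})

print(all(out == EXPECTED for out in run_network(FRAGMENTS, seed=0)))
```

Because the query is monotone, every pair a node outputs early remains correct once more facts arrive, so all nodes converge to the same result whatever the delivery order. A non-monotone query, such as the complement of the closure, could not be output eagerly in this way.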

Citations: 83

The ACM PODS Alberto O. Mendelzon test-of-time-award 2010

Pub Date : 2010-06-06 DOI: 10.1145/1807085.1807093

Jianwen Su, Phokion G. Kolaitis

Citations: 0
