Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems: Latest Articles

Probabilistic Databases for All
Dan Suciu
In probabilistic databases the data is uncertain and is modeled by a probability distribution. The central problem in probabilistic databases is query evaluation, which requires performing not only traditional data processing such as joins, projections, unions, but also probabilistic inference in order to compute the probability of each item in the answer. At their core, probabilistic databases are a proposal to integrate logic with probability theory. This paper accompanies a talk given as part of the Gems of PODS series, and describes several results in probabilistic databases, explaining their significance in the broader context of model counting, probabilistic inference, and Statistical Relational Models.
DOI: https://doi.org/10.1145/3375395.3389129 (published 2020-06-14)
Citations: 8
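To make the query-evaluation problem in the abstract above concrete, here is a minimal sketch that computes the probability of a Boolean query by enumerating possible worlds. It assumes a tuple-independent database (each tuple is present independently with its own probability); the relations `R` and `S`, the probabilities, and the function names are hypothetical and not taken from the paper. Enumeration is exponential in the number of tuples, so this only illustrates the semantics, not an efficient inference method.

```python
from itertools import product

def query_probability(tuples, query):
    """Sum the weights of all possible worlds in which `query` holds.

    tuples: dict mapping a tuple to its marginal probability (assumed independent).
    query:  function taking the set of tuples present in a world, returning bool.
    """
    items = list(tuples.items())
    total = 0.0
    for present in product([False, True], repeat=len(items)):
        world = {t for (t, _), keep in zip(items, present) if keep}
        weight = 1.0
        for (_, p), keep in zip(items, present):
            weight *= p if keep else 1.0 - p
        if query(world):
            total += weight
    return total

# Hypothetical uncertain relations R(x) and S(x, y).
db = {("R", "a"): 0.5, ("S", "a", "b"): 0.4, ("S", "a", "c"): 0.5}

def exists_r_join_s(world):
    # Boolean query: there exist x, y such that R(x) and S(x, y) both hold.
    r_values = {t[1] for t in world if t[0] == "R"}
    return any(t[0] == "S" and t[1] in r_values for t in world)

print(query_probability(db, exists_r_join_s))
```

For this toy instance the result agrees with the closed form 0.5 * (1 - 0.6 * 0.5) = 0.35, up to floating-point rounding.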
Efficient Indexes for Diverse Top-k Range Queries
P. Agarwal, Stavros Sintos, Alex Steiger
Let P be a set of n (non-negatively) weighted points in R^d. We consider the problem of computing a subset of (at most) k diverse and high-valued points of P that lie inside a query range, a problem relevant to many areas such as search engines, recommendation systems, and online stores. The diversity and value of a set of points are measured as functions (say average or minimum) of their pairwise distances and weights, respectively. We study both bicriteria and constrained optimization problems. In the former, we wish to return a set of k points that maximize a weighted sum of their value and diversity measures, and in the latter, we wish to return a set of at most k points that maximize their value and satisfy a diversity constraint. We obtain three main types of results in this paper: (1) Near-linear time (0.5-ε)-approximation algorithms for the bicriteria optimization problem in the offline setting. (2) Near-linear size indexes for the bicriteria optimization problem that, for a query rectangle, return a (0.5-ε)-approximate solution in time O(k polylog(n)). The indexes can be constructed in O(n polylog(n)) time. (3) Near-linear size indexes for answering constrained optimization range queries. For a query rectangle, a 0.5^{O(d)}-approximate solution can be computed in O(k polylog(n)) time. If we allow some of the returned points to lie at most ε outside of the query rectangle, then a (1-ε)-approximate solution can be computed in O(k polylog(n)) time. The indexes are constructed in O(n polylog(n)) and n^{O(1/ε^d)} time, respectively.
DOI: https://doi.org/10.1145/3375395.3387667 (published 2020-06-14)
Citations: 7
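The bicriteria objective described in the abstract above can be illustrated with a brute-force baseline. This is only a sketch under stated assumptions: value is taken as the average weight, diversity as the minimum pairwise Euclidean distance, and the trade-off parameter `lam`, the 2D points, and the function name are all hypothetical. It tries every k-subset inside the query rectangle, so it is exponential in k and is not the near-linear index from the paper.

```python
from itertools import combinations
from math import dist

def best_diverse_subset(points, rect, k, lam=1.0):
    """Exhaustively pick k points inside `rect` maximizing value + lam * diversity.

    points: list of ((x, y), weight) pairs.
    rect:   ((x_lo, y_lo), (x_hi, y_hi)), an axis-aligned query rectangle.
    Returns (subset, score), or (None, -inf) if fewer than k points lie inside.
    """
    (x_lo, y_lo), (x_hi, y_hi) = rect
    inside = [(p, w) for p, w in points
              if x_lo <= p[0] <= x_hi and y_lo <= p[1] <= y_hi]
    best, best_score = None, float("-inf")
    for subset in combinations(inside, k):
        value = sum(w for _, w in subset) / k  # average weight
        diversity = min(
            (dist(p, q) for (p, _), (q, _) in combinations(subset, 2)),
            default=0.0,  # a single point has no pairwise distance
        )
        score = value + lam * diversity
        if score > best_score:
            best, best_score = subset, score
    return best, best_score

# Hypothetical data: the heavy point (10, 10) lies outside the query rectangle.
pts = [((0, 0), 1.0), ((0, 1), 1.0), ((3, 0), 1.0), ((10, 10), 5.0)]
chosen, score = best_diverse_subset(pts, ((0, 0), (4, 4)), k=2)
print(chosen, score)
```

A baseline like this is mainly useful for checking an approximate index on small inputs, since the approximation ratio is defined against exactly this optimum.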
Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
DOI: https://doi.org/10.1145/3375395 (published 2020-06-11)
Citations: 1
First-Order Rewritability in Consistent Query Answering with Respect to Multiple Keys
Paraschos Koutris, J. Wijsen
We study consistent query answering with respect to key dependencies. Given a (possibly inconsistent) database instance and a set of key dependencies, a repair is an inclusion-maximal subinstance that satisfies all key dependencies. Consistent query answering for a Boolean query is the following problem: given a database instance as input, is the query true in every repair? In [Koutris and Wijsen, ICDT 2019], it was shown that for every self-join-free Boolean conjunctive query and set of key dependencies containing exactly one key dependency per relation name (also called the primary key), this problem is in FO, L-complete, or coNP-complete, and it is decidable which of the three cases applies. In this paper, we consider the more general case where a relation name can be associated with more than one key dependency. It is shown that in this more general setting, it remains decidable whether or not the above problem is in FO, for self-join-free Boolean conjunctive queries. Moreover, it is possible to effectively construct a first-order query that solves the problem whenever such a query exists.
DOI: https://doi.org/10.1145/3375395.3387654 (published 2020-05-29)
Citations: 8
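The definitions of repair and consistent answer in the abstract above can be sketched by brute force. The sketch assumes the simpler setting of one relation with a single primary key, where a repair keeps exactly one fact from each group of facts sharing a key value; the multiple-key case studied in the paper is not covered. The `Emp` facts and function names are hypothetical.

```python
from itertools import product

def repairs(facts, key_len):
    """Yield every repair of one relation under a single primary key.

    facts:   list of tuples; the first `key_len` positions form the key.
    A repair keeps exactly one fact per key value (inclusion-maximal and consistent).
    """
    blocks = {}
    for fact in facts:
        blocks.setdefault(fact[:key_len], []).append(fact)
    for choice in product(*blocks.values()):
        yield set(choice)

def certain(facts, key_len, query):
    """True iff the Boolean `query` holds in every repair."""
    return all(query(repair) for repair in repairs(facts, key_len))

# Hypothetical inconsistent relation Emp(name, dept) with key `name`.
emp = [("ann", "db"), ("ann", "ai"), ("bob", "db")]

someone_in_db = lambda r: any(dept == "db" for _, dept in r)
ann_in_db = lambda r: ("ann", "db") in r

print(certain(emp, 1, someone_in_db))  # bob is in db in every repair
print(certain(emp, 1, ann_in_db))      # fails in the repair that keeps ("ann", "ai")
```

The number of repairs grows exponentially with the number of conflicting key groups, which is why a first-order rewriting, when one exists, is attractive: it answers the same question with a single ordinary query.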
Parallel Algorithms for Sparse Matrix Multiplication and Join-Aggregate Queries
Xiao Hu, K. Yi
In this paper, we design massively parallel algorithms for sparse matrix multiplication, as well as more general join-aggregate queries, where the join hypergraph is a tree with arbitrary output attributes. For each case, we obtain asymptotic improvement over existing algorithms. In particular, our matrix multiplication algorithm is shown to be optimal in the semiring model.
DOI: https://doi.org/10.1145/3375395.3387657 (published 2020-05-29)
Citations: 2
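The link between sparse matrix multiplication and join-aggregate queries in the abstract above can be shown with a small sequential sketch: join the two matrices on the shared index and aggregate with the semiring's operations. This is not the massively parallel algorithm from the paper; the dictionary representation and the function name are assumptions made for illustration.

```python
from collections import defaultdict

def sparse_matmul(A, B, add=lambda x, y: x + y, mul=lambda x, y: x * y):
    """Multiply sparse matrices over a semiring given by `add` and `mul`.

    A: dict {(i, j): value}, B: dict {(j, k): value}; absent entries are the
    semiring's zero. Equivalent to joining A and B on j, then aggregating per (i, k).
    """
    b_by_row = defaultdict(list)
    for (j, k), value in B.items():
        b_by_row[j].append((k, value))
    C = {}
    for (i, j), a in A.items():
        for k, b in b_by_row.get(j, ()):
            term = mul(a, b)
            C[(i, k)] = add(C[(i, k)], term) if (i, k) in C else term
    return C

A = {(0, 0): 1, (0, 1): 2}
B = {(0, 0): 3, (1, 0): 4}

print(sparse_matmul(A, B))                                       # ordinary (+, *)
print(sparse_matmul(A, B, add=min, mul=lambda x, y: x + y))      # (min, +) semiring
```

Only `add` and `mul` are used, never subtraction or division, which is the restriction the semiring model places on an algorithm.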
Queries with Arithmetic on Incomplete Databases
Marco Console, M. Hofer, L. Libkin
The standard notion of query answering over incomplete databases is that of certain answers, guaranteeing correctness regardless of how incomplete data is interpreted. In the majority of real-life databases, relations have numerical columns and queries use arithmetic and comparisons. Even though the notion of certain answers still applies, we explain that it becomes much more problematic in situations when missing data occurs in numerical columns. We propose a new general framework that allows us to assign a measure of certainty to query answers. We test it in the agnostic scenario where we do not have prior information about values of numerical attributes, similarly to the predominant approach in handling incomplete data which assumes that each null can be interpreted as an arbitrary value of the domain. The key technical challenge is the lack of a uniform distribution over the entire domain of numerical attributes, such as real numbers. We overcome this by associating the measure of certainty with the asymptotic behavior of volumes of some subsets of the Euclidean space. We show that this measure is well-defined, and describe approaches to computing and approximating it. While it can be computationally hard, or result in an irrational number, even for simple constraints, we produce polynomial-time randomized approximation schemes with multiplicative guarantees for conjunctive queries, and with additive guarantees for arbitrary first-order queries. We also describe a set of experimental results to confirm the feasibility of this approach.
DOI: https://doi.org/10.1145/3375395.3387666 (published 2020-05-29)
Citations: 6
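The abstract above does not spell out how the measure of certainty is defined, so the following is only a loose sketch of the general idea. It assumes the measure can be approximated by drawing each numerical null uniformly from a large cube [-r, r]^n and counting how often a condition holds; the cube, the radius, the sample count, and the example condition are all assumptions for illustration and are not the authors' construction or their approximation scheme.

```python
import random

def estimate_certainty(condition, num_nulls, radius=1e6, samples=20000, seed=0):
    """Monte Carlo estimate of the fraction of null valuations satisfying `condition`.

    Each null is drawn uniformly from [-radius, radius]; a large radius stands in
    (by assumption) for the asymptotic behavior mentioned in the abstract.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        valuation = [rng.uniform(-radius, radius) for _ in range(num_nulls)]
        if condition(valuation):
            hits += 1
    return hits / samples

# Hypothetical row whose price and cost are both unknown:
# how often does price > cost + 10 hold?
estimate = estimate_certainty(lambda v: v[0] > v[1] + 10, num_nulls=2)
print(estimate)
```

As the radius grows, the constant offset of 10 becomes negligible, so the estimate settles near one half. A sampling estimate of this kind naturally carries an additive error, which matches the kind of guarantee the abstract mentions for arbitrary first-order queries.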
Projection Views of Register Automata
L. Segoufin, V. Vianu
Register automata have been used as a convenient model for specifying and verifying database driven systems. An important problem in such systems is to provide views that hide or restructure certain information about the data or process, extending classical notions of database views. In this paper we carry out a formal investigation of views of register automata by considering simple views that project away some of the registers. We show that classical register automata are not able to describe such projections and introduce more powerful register automata that are able to do so. We also show useful properties of these automata such as closure under projection and decidability of verifying temporal properties of their runs.
DOI: https://doi.org/10.1145/3375395.3387651 (published 2020-05-29)
Citations: 1
2020 ACM PODS Alberto O. Mendelzon Test-of-Time Award
G. Gottlob, J. V. D. Bussche, D. V. Gucht
The PODS Executive Committee has appointed us to serve as the Award Committee for 2020. The committee would like to state that PODS 2010 boasted an exceptional set of influential papers, attesting to the strength and relevance of the field. We received a significant number of nominations for different truly excellent papers. After careful consideration and having solicited external nominations and advice, we have selected the following paper as the award winner for 2020:
DOI: https://doi.org/10.1145/3375395.3387723 (published 2020-05-29)
Citations: 0
Coping with Incomplete Data: Recent Advances
Marco Console, P. Guagliardo, L. Libkin, Etienne Toussaint
Handling incomplete data in a correct manner is a notoriously hard problem in databases. Theoretical approaches rely on the computationally hard notion of certain answers, while practical solutions rely on ad hoc query evaluation techniques based on three-valued logic. Can we find a middle ground, and produce correct answers efficiently? The paper surveys results of the last few years motivated by this question. We re-examine the notion of certainty itself, and show that it is much more varied than previously thought. We identify cases when certain answers can be computed efficiently and, short of that, provide deterministic and probabilistic approximation schemes for them. We look at the role of three-valued logic as used in SQL query evaluation, and discuss the correctness of the choice, as well as the necessity of such a logic for producing query answers.
DOI: https://doi.org/10.1145/3375395.3387970 (published 2020-05-29)
Citations: 11
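The three-valued logic mentioned in the abstract above can be sketched in a few lines, with Python's `None` standing in for SQL's unknown truth value. The function names are hypothetical; the truth tables follow the usual SQL rules, where a comparison involving NULL is unknown and a `WHERE` clause keeps a row only when its condition is true.

```python
def and3(a, b):
    """Three-valued AND: False dominates, then unknown (None)."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def not3(a):
    """Three-valued NOT: unknown stays unknown."""
    return None if a is None else not a

def eq3(x, y):
    """SQL-style equality: any comparison with NULL is unknown."""
    return None if x is None or y is None else x == y

def not_in(x, values):
    """x NOT IN (values), evaluated as a conjunction of x <> v."""
    result = True
    for v in values:
        result = and3(result, not3(eq3(x, v)))
    return result

print(not_in(1, [2, 3]))     # True: the row would be kept
print(not_in(1, [2, None]))  # None: unknown, so WHERE would drop the row
print(not_in(1, [1, None]))  # False
```

The middle case is the well-known surprise: a single NULL in the list makes `NOT IN` unable to return true, so a row that looks like an answer silently disappears from the result.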
Deciding Robustness for Lower SQL Isolation Levels
Bas Ketsman, Christoph Koch, F. Neven, Brecht Vandevoort
While serializability always guarantees application correctness, lower isolation levels can be chosen to improve transaction throughput at the risk of introducing certain anomalies. A set of transactions is robust against a given isolation level if every possible interleaving of the transactions under the specified isolation level is serializable. Robustness therefore always guarantees application correctness with the performance benefit of the lower isolation level. While the robustness problem has received considerable attention in the literature, only sufficient conditions have been obtained. The most notable exception is the seminal work by Fekete where he obtained a characterization for deciding robustness against SNAPSHOT ISOLATION. In this paper, we address the robustness problem for the lower SQL isolation levels READ UNCOMMITTED and READ COMMITTED, which are defined in terms of the forbidden dirty write and dirty read patterns. The first main contribution of this paper is that we characterize robustness against both isolation levels in terms of the absence of counterexample schedules of a specific form (split and multi-split schedules) and by the absence of cycles in interference graphs that satisfy various properties. A critical difference with Fekete's work is that the properties of cycles obtained in this paper have to take the relative ordering of operations within transactions into account as READ UNCOMMITTED and READ COMMITTED do not satisfy the atomic visibility requirement. A particular consequence is that the latter renders the robustness problem against READ COMMITTED coNP-complete. The second main contribution of this paper is the coNP-hardness proof. For READ UNCOMMITTED, we obtain LOGSPACE-completeness.
DOI: https://doi.org/10.1145/3375395.3387655 (published 2020-05-29)
Citations: 1
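Robustness, as defined in the abstract above, asks whether every interleaving allowed by an isolation level is serializable. The sketch below covers only the building block of testing a single schedule for conflict serializability by looking for a cycle in its conflict graph. It is not the paper's characterization via split schedules and interference graphs; commits are omitted, and the schedule encoding and function names are assumptions.

```python
def conflict_graph(schedule):
    """Edges Ti -> Tj where an operation of Ti precedes a conflicting one of Tj.

    schedule: list of (transaction, op, object) with op in {"R", "W"}.
    Two operations conflict if they come from different transactions, touch
    the same object, and at least one of them is a write.
    """
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a directed cycle."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # node -> "active" (on the DFS stack) or "done"

    def visit(u):
        state[u] = "active"
        for v in graph[u]:
            if state.get(v) == "active" or (v not in state and visit(v)):
                return True
        state[u] = "done"
        return False

    return any(u not in state and visit(u) for u in graph)

def is_conflict_serializable(schedule):
    return not has_cycle(conflict_graph(schedule))

serial = [("T1", "R", "x"), ("T1", "W", "x"), ("T2", "R", "x"), ("T2", "W", "x")]
interleaved = [("T1", "R", "x"), ("T2", "R", "x"), ("T1", "W", "x"), ("T2", "W", "x")]
print(is_conflict_serializable(serial), is_conflict_serializable(interleaved))
```

Deciding robustness would require reasoning about all schedules an isolation level permits, which is where the complexity results in the abstract come from; checking one schedule, as here, is the easy part.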