
Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory: latest articles

Front Matter, Table of Contents, Preface, Conference Organization
B. Kimelfeld, Yael Amsterdamer
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2018.0 | Published: 2018-01-01
Citations: 0
Preserving Constraints with the Stable Chase
David Carral, M. Krötzsch, Maximilian Marx, A. Ozaki, S. Rudolph
Conjunctive query answering over databases with constraints – also known as (tuple-generating) dependencies – is considered a central database task. To this end, several versions of a construction called the chase have been described. Given a set Sigma of dependencies, it is interesting to ask which constraints not contained in Sigma that are initially satisfied in a given database instance are preserved when computing a chase over Sigma. Such constraints are an example of the more general class of incidental constraints, which, when added to Sigma as new dependencies, do not affect certain answers and might even speed up query answering. After formally introducing incidental constraints, we show that deciding incidentality is undecidable for tuple-generating dependencies, even in cases for which query entailment is decidable. For dependency sets with a finite universal model, the core chase can be used to decide incidentality. For the infinite case, we propose the stable chase, which generalises the core chase, and study its relation to incidental constraints.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2018.12 | Published: 2018-01-01
Citations: 8
Querying the Unary Negation Fragment with Regular Path Expressions
J. C. Jung, C. Lutz, M. Martel, Thomas Schneider
The unary negation fragment of first-order logic (UNFO) has recently been proposed as a generalization of modal logic that shares many of its good computational and model-theoretic properties. It is attractive from the perspective of database theory because it can express conjunctive queries (CQs) and ontologies formulated in many description logics (DLs). Both are relevant for ontology-mediated querying and, in fact, CQ evaluation under UNFO ontologies (and thus also under DL ontologies) can be "expressed" in UNFO as a satisfiability problem. In this paper, we consider the natural extension of UNFO with regular expressions on binary relations. The resulting logic UNFOreg can express (unions of) conjunctive two-way regular path queries (C2RPQs) and ontologies formulated in DLs that include transitive roles and regular expressions on roles. Our main results are that evaluating C2RPQs under UNFOreg ontologies is decidable, 2ExpTime-complete in combined complexity, and coNP-complete in data complexity, and that satisfiability in UNFOreg is 2ExpTime-complete, thus not harder than in UNFO.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2018.15 | Published: 2018-01-01
Citations: 11
Connecting Width and Structure in Knowledge Compilation
Antoine Amarilli, Mikaël Monet, P. Senellart
Several query evaluation tasks can be done via knowledge compilation: the query result is compiled as a lineage circuit from which the answer can be determined. For such tasks, it is important to leverage some width parameters of the circuit, such as bounded treewidth or pathwidth, to convert the circuit to structured classes, e.g., deterministic structured NNFs (d-SDNNFs) or OBDDs. In this work, we show how to connect the width of circuits to the size of their structured representation, through upper and lower bounds. For the upper bound, we show how bounded-treewidth circuits can be converted to a d-SDNNF, in time linear in the circuit size. Our bound, unlike existing results, is constructive and only singly exponential in the treewidth. We show a related lower bound on monotone DNF or CNF formulas, assuming a constant bound on the arity (size of clauses) and degree (number of occurrences of each variable). Specifically, any d-SDNNF (resp., SDNNF) for such a DNF (resp., CNF) must be of exponential size in its treewidth; and the same holds for pathwidth when compiling to OBDDs. Our lower bounds, in contrast with most previous work, apply to any formula of this class, not just a well-chosen family. Hence, for our language of DNF and CNF, pathwidth and treewidth respectively characterize the efficiency of compiling to OBDDs and (d-)SDNNFs, that is, compilation is singly exponential in the width parameter. We conclude by applying our lower bound results to the task of query evaluation.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2018.6 | Published: 2017-09-18
Citations: 9
Constant Delay Enumeration for FO Queries over Databases with Local Bounded Expansion
L. Segoufin, Alexandre Vigny
We consider the evaluation of first-order queries over classes of databases with local bounded expansion. This class was introduced by Nešetřil and Ossona de Mendez and generalizes many well-known classes of databases, such as bounded degree, bounded treewidth or bounded expansion. It is known that over classes of databases with local bounded expansion, first-order sentences can be evaluated in pseudo-linear time (pseudo-linear time means that for every epsilon > 0 there exists an algorithm working in time O(n^{1+epsilon})). Here, we investigate other scenarios, where queries are not sentences. We show that first-order queries can be enumerated with constant delay after a pseudo-linear preprocessing over any class of databases having locally bounded expansion. We also show that, in this context, counting the number of solutions can be done in pseudo-linear time.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2017.20 | Published: 2017-03-15
Citations: 26
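The parenthetical definition of pseudo-linear time in the abstract above can be restated as a formula. This is only a restatement: the symbols $A_\varepsilon$ (the algorithm chosen for a given $\varepsilon$) and $n$ (the size of the database) are notation assumed here, not taken from the paper.

```latex
\[
  \forall \varepsilon > 0 \;\; \exists A_\varepsilon : \quad
  \mathrm{time}_{A_\varepsilon}(n) \in O\!\left(n^{1+\varepsilon}\right)
\]
```

The constant hidden in the $O(\cdot)$ may depend on $\varepsilon$. For instance, a running time of $n \log n$ is pseudo-linear in this sense, because $\log n \in O(n^{\varepsilon})$ for every fixed $\varepsilon > 0$.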
Top-k Querying of Unknown Values under Order Constraints (Extended Version)
Antoine Amarilli, Yael Amsterdamer, T. Milo, P. Senellart
Many practical scenarios make it necessary to evaluate top-k queries over data items with partially unknown values. This paper considers a setting where the values are taken from a numerical domain, and where some partial order constraints are given over known and unknown values: under these constraints, we assume that all possible worlds are equally likely. Our work is the first to propose a principled scheme to derive the value distributions and expected values of unknown items in this setting, with the goal of computing estimated top-k results by interpolating the unknown values from the known ones. We study the complexity of this general task, and show tight complexity bounds, proving that the problem is intractable, but can be tractably approximated. We then consider the case of tree-shaped partial orders, where we show a constructive PTIME solution. We also compare our problem setting to other top-k definitions on uncertain data.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2017.5 | Published: 2017-01-10
Citations: 4
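To make the "all possible worlds are equally likely" assumption in the abstract above concrete, here is a minimal Python sketch for the simplest special case only: n unknown values that form a chain (a total order) between two known bounds. In that case the uniform distribution over admissible assignments coincides with the order statistics of n independent uniform draws, so the i-th unknown has expected value low + (high - low) * i / (n + 1). This is an illustration of the setting, not the paper's algorithm for general or tree-shaped partial orders; the function names and the hotel example are hypothetical.

```python
from fractions import Fraction
import random


def expected_chain_values(low, high, n):
    """Exact expected values of n unknowns with low <= x1 <= ... <= xn <= high,
    assuming every admissible assignment is equally likely."""
    low, high = Fraction(low), Fraction(high)
    return [low + (high - low) * Fraction(i, n + 1) for i in range(1, n + 1)]


def sampled_chain_values(low, high, n, trials=20000, seed=0):
    """Monte Carlo cross-check: sorting n independent uniform draws yields a
    uniform sample of the ordered region."""
    rng = random.Random(seed)
    sums = [0.0] * n
    for _ in range(trials):
        draw = sorted(rng.uniform(low, high) for _ in range(n))
        for i, value in enumerate(draw):
            sums[i] += value
    return [s / trials for s in sums]


def top_k(known, estimated, k):
    """Rank known values and interpolated estimates together; return the k best names."""
    merged = {**known, **estimated}
    return sorted(merged, key=merged.get, reverse=True)[:k]


if __name__ == "__main__":
    estimates = expected_chain_values(0, 1, 3)
    print(estimates)  # exact fractions for the three unknowns
    print(sampled_chain_values(0, 1, 3))  # should be close to the exact values
    known = {"hotel_A": Fraction(9, 10), "hotel_B": Fraction(1, 5)}
    unknown = dict(zip(["u1", "u2", "u3"], estimates))
    print(top_k(known, unknown, 2))
```

The exact computation uses `Fraction` so that the interpolated values can be compared with known values without rounding issues; the sampling function is there only as a sanity check on the closed form.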
The Smart Crowd - Learning from the Ones Who Know (Invited Talk)
T. Milo
One of the foremost challenges for information technology over the last few years has been to explore, understand, and extract useful information from large amounts of data. Some particular tasks such as annotating data or matching entities have been outsourced to human workers for many years. But the last few years have seen the rise of a new research field called crowdsourcing that aims at delegating a wide range of tasks to human workers, building formal frameworks, and improving the efficiency of these processes. In order to provide sound scientific foundations for crowdsourcing and support the development of efficient crowdsourcing processes, adequate formal models and algorithms must be defined. In particular, the models must formalize unique characteristics of crowd-based settings, such as the knowledge of the crowd and crowd-provided data; the interaction with crowd members; the inherent inaccuracies and disagreements in crowd answers; and evaluation metrics that capture the cost and effort of the crowd. Clearly, what may be achieved with the help of the crowd depends heavily on the properties and knowledge of the given crowd. In this talk we will focus on knowledgeable crowds. We will examine the use of such crowds, and in particular domain experts, to assist in solving data management problems. Specifically we will consider three dimensions of the problem: (1) How domain experts can help in improving the data itself, e.g. by gathering missing data and improving the quality of existing data, (2) How they can assist in gathering meta-data that facilitate improved data processing, and (3) How we can find and identify the most relevant crowd for a given data management task.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2017.3 | Published: 2017-01-01
Citations: 1
Detecting Ambiguity in Prioritized Database Repairing
B. Kimelfeld, Ester Livshits, L. Peterfreund
In its traditional definition, a repair of an inconsistent database is a consistent database that differs from the inconsistent one in a "minimal way." Often, repairs are not equally legitimate, as it is desired to prefer one over another; for example, one fact is regarded as more reliable than another, or a more recent fact should be preferred to an earlier one. Motivated by these considerations, researchers have introduced and investigated the framework of preferred repairs, in the context of denial constraints and subset repairs. There, a priority relation between facts is lifted towards a priority relation between consistent databases, and repairs are restricted to the ones that are optimal in the lifted sense. Three notions of lifting (and optimal repairs) have been proposed: Pareto, global, and completion. In this paper we investigate the complexity of deciding whether the priority relation suffices to clean the database unambiguously, or in other words, whether there is exactly one optimal repair. We show that the different lifting semantics entail highly different complexities. Under Pareto optimality, the problem is coNP-complete, in data complexity, for every set of functional dependencies (FDs), except for the tractable case of (equivalence to) one FD per relation. Under global optimality, one FD per relation is still tractable, but we establish Pi-2-p-completeness for a relation with two FDs. In contrast, under completion optimality the problem is solvable in polynomial time for every set of FDs. In fact, we present a polynomial-time algorithm for arbitrary conflict hypergraphs. We further show that under a general assumption of transitivity, this algorithm solves the problem even for global optimality. The algorithm is extremely simple, but its proof of correctness is quite intricate.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2017.17 | Published: 2017-01-01
Citations: 17
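The question studied in the abstract above (is there exactly one optimal repair?) can be illustrated on toy inputs with a brute-force Python sketch. Several things here are assumptions rather than content of the abstract: conflicts are given as pairs of facts (the paper also covers conflict hypergraphs), the Pareto and global improvement tests follow the formulation commonly used in the preferred-repairs literature and should be checked against the paper, completion optimality is left out, and all function names are hypothetical. The enumeration is exponential in the number of facts, so this is in no way the polynomial-time algorithm the abstract mentions.

```python
from itertools import combinations


def consistent_subsets(facts, conflicts):
    """All subsets of `facts` that contain no conflicting pair."""
    facts = sorted(facts)
    result = []
    for size in range(len(facts) + 1):
        for combo in combinations(facts, size):
            subset = frozenset(combo)
            if not any(a in subset and b in subset for a, b in conflicts):
                result.append(subset)
    return result


def is_pareto_improvement(j, i, prefers):
    """Assumed definition: some fact gained by j is preferred to every fact lost from i."""
    lost = i - j
    return any(all((f, g) in prefers for g in lost) for f in j - i)


def is_global_improvement(j, i, prefers):
    """Assumed definition: j differs from i and every lost fact is beaten by some gained fact."""
    gained = j - i
    return j != i and all(any((f, g) in prefers for f in gained) for g in i - j)


def optimal_repairs(facts, conflicts, prefers, improves):
    """Consistent subsets that no other consistent subset improves upon."""
    candidates = consistent_subsets(facts, conflicts)
    return [i for i in candidates
            if not any(improves(j, i, prefers) for j in candidates)]


def is_unambiguous(facts, conflicts, prefers, improves):
    """True when exactly one optimal repair exists."""
    return len(optimal_repairs(facts, conflicts, prefers, improves)) == 1


if __name__ == "__main__":
    facts = {"a", "b", "c"}
    conflicts = {("a", "b"), ("b", "c")}  # b conflicts with both a and c
    prefers = {("a", "b")}                # a is preferred to b
    print(optimal_repairs(facts, conflicts, prefers, is_pareto_improvement))
    print(is_unambiguous(facts, conflicts, set(), is_pareto_improvement))
```

In the toy instance, the two subset repairs are {b} and {a, c}. With the priority a over b, only {a, c} survives under both tests; with an empty priority relation both repairs remain optimal, so the cleaning is ambiguous.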
Graphs, Hypergraphs, and the Complexity of Conjunctive Database Queries (Invited Talk)
D. Marx
The complexity of evaluating conjunctive queries can depend significantly on the structure of the query. For example, it is well known that various notions of acyclicity can make the evaluation problem tractable. More generally, it seems that the complexity is connected to the "treelikeness" of the graph or hypergraph describing the query structure. In the lecture, we will review some of the notions of treelikeness that were proposed in the literature and how they are relevant for the complexity of evaluating conjunctive queries and related problems.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2017.2 | Published: 2017-01-01
Citations: 1
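The abstract above does not say which notions of acyclicity the talk covers. One standard notion is alpha-acyclicity of the query hypergraph (one hyperedge per atom, containing that atom's variables), which can be tested with the classical GYO reduction. The Python sketch below is offered as an illustration of that one notion under this assumption; the function name is hypothetical.

```python
def is_alpha_acyclic(hyperedges):
    """GYO reduction: repeatedly (1) delete vertices that occur in exactly one
    hyperedge and (2) delete a hyperedge contained in another hyperedge.
    The hypergraph is alpha-acyclic iff nothing but empty edges remains."""
    edges = [set(e) for e in hyperedges]
    changed = True
    while changed:
        changed = False
        # Rule 1: remove vertices that appear in exactly one hyperedge.
        counts = {}
        for e in edges:
            for v in e:
                counts[v] = counts.get(v, 0) + 1
        for e in edges:
            lonely = {v for v in e if counts[v] == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: remove one hyperedge that is a subset of a different hyperedge.
        for i, e in enumerate(edges):
            if any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return all(not e for e in edges)


if __name__ == "__main__":
    # Path query R(a, b), S(b, c): acyclic.
    print(is_alpha_acyclic([{"a", "b"}, {"b", "c"}]))
    # Triangle query R(a, b), S(b, c), T(a, c): cyclic.
    print(is_alpha_acyclic([{"a", "b"}, {"b", "c"}, {"a", "c"}]))
```

Adding an atom that covers all three variables of the triangle makes the hypergraph alpha-acyclic again, because each binary edge is then contained in the larger edge. This non-monotone behaviour is one reason several different acyclicity notions exist.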
Distributed Query Monitoring through Convex Analysis: Towards Composable Safe Zones
M. Garofalakis, V. Samoladas
Continuous tracking of complex data analytics queries over high-speed distributed streams is becoming increasingly important. Query tracking can be reduced to continuous monitoring of a condition over the global stream. Communication-efficient monitoring relies on locally processing stream data at the sites where it is generated, by deriving site-local conditions which collectively guarantee the global condition. Recently proposed geometric techniques offer a generic approach for splitting an arbitrary global condition into local geometric monitoring constraints (known as “Safe Zones”); still, their application to various problem domains has so far been based on heuristics and lacking a principled, compositional methodology. In this paper, we present the first known formal results on the difficult problem of effective Safe Zone (SZ) design for complex query monitoring over distributed streams. Exploiting tools from convex analysis, our approach relies on an algebraic representation of SZs which allows us to: (1) Formally define the notion of a “good” SZ for distributed monitoring problems; and, most importantly, (2) Tackle and solve the important problem of systematically composing SZs for monitored conditions expressed as Boolean formulas over simpler conditions (for which SZs are known); furthermore, we prove that, under broad assumptions, the composed SZ is good if the component SZs are good. Our results are, therefore, a first step towards a principled compositional solution to SZ design for distributed query monitoring. Finally, we discuss a number of important applications for our SZ design algorithms, also demonstrating how earlier geometric techniques can be seen as special cases of our framework. 1998 ACM Subject Classification H.2.m [Database Management] Miscellaneous, C.2.4 [Computer-Communication Networks] Distributed Systems
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2017.14
Publication date: 2017-01-01
Citations: 7