Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems: Latest Papers

The Tractability Frontier of Well-designed SPARQL Queries
M. Romero
We study the complexity of query evaluation of SPARQL queries. We focus on the fundamental fragment of well-designed SPARQL restricted to the AND, OPTIONAL and UNION operators. Our main result is a structural characterisation of the classes of well-designed queries that can be evaluated in polynomial time. In particular, we introduce a new notion of width called domination width, which relies on the well-known notion of treewidth. We show that, under some complexity theoretic assumptions, the classes of well-designed queries that can be evaluated in polynomial time are precisely those of bounded domination width.
DOI: https://doi.org/10.1145/3196959.3196973
Publication date: 2017-12-23
Citations: 5
Heavy Hitters and the Structure of Local Privacy
Mark Bun, Jelani Nelson, Uri Stemmer
We present a new locally differentially private algorithm for the heavy hitters problem which achieves optimal worst-case error as a function of all standardly considered parameters. Prior work obtained error rates which depend optimally on the number of users, the size of the domain, and the privacy parameter, but depend sub-optimally on the failure probability. We strengthen existing lower bounds on the error to incorporate the failure probability, and show that our new upper bound is tight with respect to this parameter as well. Our lower bound is based on a new understanding of the structure of locally private protocols. We further develop these ideas to obtain the following general results beyond heavy hitters. (1) Advanced Grouposition: In the local model, group privacy for k users degrades proportionally to root k, instead of linearly in k as in the central model. Stronger group privacy yields improved max-information guarantees, as well as stronger lower bounds (via "packing arguments"), over the central model. (2) Building on a transformation of Bassily and Smith (STOC 2015), we give a generic transformation from any non-interactive approximate-private local protocol into a pure-private local protocol. Again in contrast with the central model, this shows that we cannot obtain more accurate algorithms by moving from pure to approximate local privacy.
DOI: https://doi.org/10.1145/3196959.3196981
Publication date: 2017-11-13
Citations: 139
Compressed Representations of Conjunctive Query Results
Shaleen Deep, Paraschos Koutris
Relational queries, and in particular join queries, often generate large output results when executed over a huge dataset. In such cases, it is often infeasible to store the whole materialized output if we plan to reuse it further down a data processing pipeline. Motivated by this problem, we study the construction of space-efficient compressed representations of the output of conjunctive queries, with the goal of supporting the efficient access of the intermediate compressed result for a given access pattern. In particular, we initiate the study of an important tradeoff: minimizing the space necessary to store the compressed result, versus minimizing the answer time and delay for an access request over the result. Our main contribution is a novel parameterized data structure, which can be tuned to trade off space for answer time. The tradeoff allows us to control the space requirement of the data structure precisely, and depends both on the structure of the query and the access pattern. We show how we can use the data structure in conjunction with query decomposition techniques in order to efficiently represent the outputs for several classes of conjunctive queries.
DOI: https://doi.org/10.1145/3196959.3196979
Publication date: 2017-09-18
Citations: 24
Improvements on the k-center Problem for Uncertain Data
Sharareh Alipour, A. Jafari
In real applications, there are situations where we need to model some problems based on uncertain data. This leads us to define an uncertain model for some classical geometric optimization problems and propose algorithms to solve them. This paper studies the assigned version of the k-center problem for n uncertain points in a metric space. The main approach is to replace each uncertain point with a carefully chosen certain point. We argue that the k-center solution for these certain replacements of our uncertain points is a good constant-factor approximation for the original uncertain k-center problem. This approach enables us to present fast and simple algorithms that give a 10-approximation for the k-center problem in any metric space; when the ambient space is Euclidean, this can be improved to a (3+ε)-approximation for any ε>0. These algorithms improve both the approximation factor and the running time of the previously known algorithms. Our algorithms are also suitable for streaming and big-data settings.
DOI: https://doi.org/10.1145/3196959.3196969
Publication date: 2017-08-30
Citations: 11
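To make the "replace each uncertain point with a certain point" idea from the k-center abstract above concrete, here is a minimal Python sketch. It rests on assumptions that are not taken from the paper: each uncertain point is a small discrete distribution over 2D locations, its representative is the probability-weighted mean (a hypothetical choice; the abstract does not say which representative the authors use), and the certain k-center instance is then solved with the classical farthest-first greedy heuristic rather than the authors' own algorithm. The names `expected_point` and `farthest_first_centers` are illustrative only.

```python
import math


def expected_point(locations):
    """Collapse an uncertain point, given as (probability, (x, y)) pairs, to its mean."""
    x = sum(p * loc[0] for p, loc in locations)
    y = sum(p * loc[1] for p, loc in locations)
    return (x, y)


def farthest_first_centers(points, k):
    """Greedy k-center: repeatedly add the point farthest from the centers chosen so far."""
    centers = [points[0]]
    # dist[i] is the distance from points[i] to its nearest chosen center.
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        idx = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[idx])
        dist = [min(d, math.dist(p, points[idx])) for d, p in zip(dist, points)]
    return centers


# Hypothetical input: three uncertain points, each a distribution over locations.
uncertain = [
    [(0.5, (0.0, 0.0)), (0.5, (2.0, 0.0))],
    [(1.0, (1.0, 1.0))],
    [(0.5, (10.0, 10.0)), (0.5, (12.0, 10.0))],
]
representatives = [expected_point(u) for u in uncertain]
centers = farthest_first_centers(representatives, k=2)
```

The sketch only illustrates the two-stage shape of the approach (collapse, then solve a certain instance); it makes no claim about the approximation factors stated in the abstract.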
First-Order Query Evaluation with Cardinality Conditions
Martin Grohe, Nicole Schweikardt
We study an extension of first-order logic FO that allows one to express cardinality conditions in a way similar to SQL's COUNT operator. The corresponding logic FOC(P) was introduced by Kuske and Schweikardt, who showed that query evaluation for this logic is fixed-parameter tractable on classes of databases of bounded degree. In this paper, we first show that the fixed-parameter tractability of FOC(P) cannot even be generalised to very simple classes of databases of unbounded degree such as unranked trees or strings with a linear order relation. Then, we identify a fragment FOC1(P) of FOC(P) which still extends FO and is sufficiently strong to express standard applications of SQL's COUNT operator. Our main result shows that query evaluation for FOC1(P) is fixed-parameter tractable on nowhere dense classes of databases. This, in particular, implies that the counting problem for first-order queries on nowhere dense classes is fixed-parameter tractable.
DOI: https://doi.org/10.1145/3196959.3196970
Publication date: 2017-07-19
Citations: 20
Reconciling Graphs and Sets of Sets
M. Mitzenmacher, Tom Morgan
We explore a generalization of set reconciliation, where the goal is to reconcile sets of sets. Alice and Bob each have a parent set consisting of s child sets, each containing at most h elements from a universe of size u. They want to reconcile their sets of sets in a scenario where the total number of differences between all of their child sets (under the minimum difference matching between their child sets) is d. We give several algorithms for this problem, and discuss applications to reconciliation problems on graphs, databases, and collections of documents. We specifically focus on graph reconciliation, providing protocols based on sets of sets reconciliation for random graphs from G(n,p) and for forests of rooted trees.
DOI: https://doi.org/10.1145/3196959.3196988
Publication date: 2017-07-18
Citations: 9
Document Spanners for Extracting Incomplete Information: Expressiveness and Complexity
Francisco Maturana, Cristian Riveros, D. Vrgoc
Rule-based information extraction has lately received a fair amount of attention from the database community, with several languages appearing in the last few years. Although information extraction systems are intended to deal with semistructured data, all language proposals introduced so far are designed to output relations, thus making them incapable of handling incomplete information. To remedy the situation, we propose to extend information extraction languages with the ability to use mappings, thus allowing us to work with documents which have missing or optional parts. Using this approach, we simplify the semantics of regex formulas and extraction rules, two previously defined methods for extracting information. We extend them with the ability to handle incomplete data, and study how they compare in terms of expressive power. We also study computational properties of these languages, focusing on the query enumeration problem, as well as satisfiability and containment.
DOI: https://doi.org/10.1145/3196959.3196968
Publication date: 2017-07-04
Citations: 37
When Can We Answer Queries Using Result-Bounded Data Interfaces?
Antoine Amarilli, Michael Benedikt
We consider answering queries on data available through access methods that provide lookup access to the tuples matching a given binding. Such interfaces are common on the Web; further, they often have bounds on how many results they can return, e.g., because of pagination or rate limits. We thus study result-bounded methods, which may return only a limited number of tuples. We study how to decide if a query is answerable using result-bounded methods, i.e., how to compute a plan that returns all answers to the query using the methods, assuming that the underlying data satisfies some integrity constraints. We first show how to reduce answerability to a query containment problem with constraints. Second, we show "schema simplification" theorems describing when and how result-bounded services can be used. Finally, we use these theorems to give decidability and complexity results about answerability for common constraint classes.
DOI: https://doi.org/10.1145/3196959.3196965
Publication date: 2017-06-24
Citations: 14
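As a rough illustration of the result-bounded access methods described in the abstract above, the following Python sketch models a lookup that returns at most a fixed number of matching tuples. Everything here is an assumption made for illustration: the `make_bounded_lookup` helper, the `employees` relation, and the choice to return the first `bound` matches (the abstract only says that a limited number of tuples may be returned, not which ones).

```python
def make_bounded_lookup(relation, bound):
    """Build an access method that returns at most `bound` tuples matching a binding."""
    def lookup(position, value):
        # A binding fixes the value at one attribute position.
        matches = [row for row in relation if row[position] == value]
        # Assumption: truncate to the first `bound` matches.
        return matches[:bound]
    return lookup


# Hypothetical data hidden behind the interface.
employees = [("ann", "sales"), ("bob", "sales"), ("cat", "sales"), ("dan", "ops")]
lookup = make_bounded_lookup(employees, bound=2)

# Listing every sales employee is not possible through this method alone:
# three tuples match, but only two come back.
sales = lookup(1, "sales")

# Checking whether any employee works in "ops" still works, since a single
# returned tuple is enough to witness existence.
has_ops = len(lookup(1, "ops")) > 0
```

The contrast between the two calls hints at why answerability under result bounds depends on the query: some questions need the full set of matches, while others only need to know whether a match exists.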
Joining Extractions of Regular Expressions
Dominik D. Freydenberger, B. Kimelfeld, L. Peterfreund
Regular expressions with capture variables, also known as "regex formulas," extract relations of spans (interval positions) from text. These relations can be further manipulated via the relational algebra as studied in the context of "document spanners," Fagin et al.'s formal framework for information extraction. We investigate the complexity of querying text by Conjunctive Queries (CQs) and Unions of CQs (UCQs) on top of regex formulas. Such queries have been investigated in prior work on document spanners, but little is known about the (combined) complexity of their evaluation. We show that the lower bounds (NP-completeness and W[1]-hardness) from the relational world also hold in our setting; in particular, hardness already holds for single-character text. Yet, the upper bounds from the relational world do not carry over. Unlike the relational world, acyclic CQs, and even gamma-acyclic CQs, are hard to compute. The source of hardness is that it may be intractable to instantiate the relation defined by a regex formula, simply because it has an exponential number of tuples. Yet, we are able to establish general upper bounds. In particular, UCQs can be evaluated with polynomial delay, provided that every CQ has a bounded number of atoms (while unions and projection can be arbitrary). Furthermore, UCQ evaluation is solvable with FPT (Fixed-Parameter Tractable) delay when the parameter is the size of the UCQ.
DOI: https://doi.org/10.1145/3196959.3196967
Publication date: 2017-03-30
Citations: 42
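The notion of a regex with capture variables extracting a relation of spans, as in the abstract above, can be approximated with Python's standard `re` module. This is only a sketch under stated assumptions: the helper `extract_spans` and the sample text are hypothetical, Python reports 0-based half-open offsets, and `re.finditer` yields only leftmost non-overlapping matches, whereas the formal semantics of regex formulas may produce many more tuples (which is the source of the exponential blow-up the abstract mentions).

```python
import re


def extract_spans(pattern, text):
    """Return one row per match, mapping each capture-variable name to its (start, end) span."""
    regex = re.compile(pattern)
    rows = []
    for match in regex.finditer(text):
        rows.append({name: match.span(name) for name in regex.groupindex})
    return rows


# Hypothetical document and pattern with two capture variables, `name` and `age`.
text = "Ann:30 Bob:25"
rows = extract_spans(r"(?P<name>[A-Za-z]+):(?P<age>\d+)", text)

# Each row is a tuple of the extracted span relation; the spans index into `text`.
names = [text[start:end] for start, end in (row["name"] for row in rows)]
```

Relations produced this way could then be joined on shared variables like ordinary tables, which is the setting whose complexity the abstract analyses.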
Containment for Rule-Based Ontology-Mediated Queries
P. Barceló, Gerald Berger, Andreas Pieris
Many efforts have been dedicated to identifying restrictions on ontologies expressed as tuple-generating dependencies (tgds), a.k.a. existential rules, that lead to the decidability of answering ontology-mediated queries (OMQs). This has given rise to three families of formalisms: guarded, non-recursive, and sticky sets of tgds. We study the containment problem for OMQs expressed in such formalisms, which is a key ingredient for solving static analysis tasks associated with them. Our main contribution is the development of specially tailored techniques for OMQ containment under the classes of tgds stated above. This enables us to obtain sharp complexity bounds for the problems at hand.
DOI: https://doi.org/10.1145/3196959.3196963
Publication date: 2017-03-23
Citations: 8