
Database theory-- ICDT : International Conference ... proceedings. International Conference on Database Theory: latest articles

Discovering Event Queries from Traces: Laying Foundations for Subsequence-Queries with Wildcards and Gap-Size Constraints
Sarah Kleest-Meißner, Rebecca Sattler, Markus L. Schmid, Nicole Schweikardt, M. Weidlich
We introduce subsequence-queries with wildcards and gap-size constraints (swg-queries, for short) as a tool for querying event traces. An swg-query $q$ is given by a string $s$ over an alphabet of variables and types, a global window size $w$, and a tuple $c = ((c^-_1, c^+_1), (c^-_2, c^+_2), \ldots, (c^-_{|s|-1}, c^+_{|s|-1}))$ of local gap-size constraints over $\mathbb{N} \times (\mathbb{N} \cup \{\infty\})$. The query $q$ matches in a trace $t$ (i.e., a sequence of types) if the variables can uniformly be substituted by types such that the resulting string occurs in $t$ as a subsequence that spans an area of length at most $w$, and the $i$th gap of the subsequence (i.e., the distance between the $i$th and $(i+1)$th position of the subsequence) has length at least $c^-_i$ and at most $c^+_i$. We formalise and investigate the task of discovering an swg-query that describes best the traces from a given sample $S$ of traces, and we present an algorithm solving this task. As a central component, our algorithm repeatedly solves the matching problem (i.e., deciding whether a given query $q$ matches in a given trace $t$), which is an NP-complete problem (in combined complexity). Hence, the matching problem is of special interest in the context of query discovery, and we therefore subject it to a detailed (parameterised) complexity analysis to identify tractable subclasses, which lead to tractable subclasses of the discovery problem as well. We complement this by a reduction proving

Proof sketch. A natural brute-force approach is as follows: Upon input of an swg-query $q = (s, w, c)$ and a trace $t$, we enumerate all mappings $\pi : \mathrm{repvars}(q) \to \mathrm{types}(t)$, and for each such mapping, we construct a regular expression $R_\pi$ that describes all traces $t'$ for which there exists a substitution $\mu : \mathrm{vars}(q) \cup \Gamma \to \Gamma$ such that $\mu$ is an extension of $\pi$ and $\mu(s) \preceq_e t'$ for some embedding $e$ that satisfies $w$ and $c$. Then, we only have to check, for each of these mappings $\pi$, whether the regular expression $R_\pi$ matches in $t$.
Another approach is to enumerate all embeddings $e : [|s|] \to [|t|]$ that satisfy $w$ and $c$ and check for each such embedding $e$ whether $\mu(s) \preceq_e t$ for some substitution $\mu$ (which can be done in time $O(|s|)$, since $\mu$ must satisfy $\mu(s) = t[e(1)]\, t[e(2)] \ldots t[e(|s|)]$). From these two algorithms and the obvious dependencies between the parameters, we can directly conclude the statements of the theorem.
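The second brute-force approach from the proof sketch (enumerate embeddings, then check for a consistent substitution) can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation. It makes two assumptions that the abstract does not settle: the span of an embedding is counted as `e[-1] - e[0] + 1`, and the length of the i-th gap is the number of trace positions strictly between two consecutive matched positions. The function name `swg_matches` and the explicit `variables` set are hypothetical.

```python
from itertools import combinations
from math import inf

def swg_matches(s, w, c, t, variables):
    """Brute-force check whether the swg-query (s, w, c) matches trace t.

    s: list of symbols (variables and types); variables: the symbols of s
    that are variables; w: window size; c: list of (lo, hi) gap bounds of
    length len(s) - 1; t: list of types.
    Assumption: span = e[-1] - e[0] + 1, gap_i = e[i+1] - e[i] - 1.
    """
    if not s:
        return True
    # Every strictly increasing choice of |s| trace positions is an embedding.
    for e in combinations(range(len(t)), len(s)):
        if e[-1] - e[0] + 1 > w:
            continue
        if any(not (lo <= e[i + 1] - e[i] - 1 <= hi)
               for i, (lo, hi) in enumerate(c)):
            continue
        # O(|s|) substitution check: types must match literally, and each
        # variable must be mapped to the same type at all of its occurrences.
        mu = {}
        for sym, pos in zip(s, e):
            if sym in variables:
                if mu.setdefault(sym, t[pos]) != t[pos]:
                    break
            elif sym != t[pos]:
                break
        else:
            return True
    return False
```

For example, with `s = ["x", "a", "x"]`, `variables = {"x"}` and the trace `list("babcb")`, the query matches for `w = 3` and `c = [(0, 0), (0, inf)]` by mapping `x` to `b`. The running time is exponential in `|s|`, in line with the NP-completeness stated in the abstract.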
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2022.18 | Published: 2022-01-01 | Pages: 18:1-18:21
Citations: 10
Expressiveness of SHACL Features
B. Bogaerts, Maxim Jakubowski, J. V. D. Bussche
SHACL is a W3C-proposed schema language for expressing structural constraints on RDF graphs. Recent work on formalizing this language has revealed a striking relationship to description logics. SHACL expressions can use four fundamental features that are not so common in description logics. These features are zero-or-one path expressions; equality tests; disjointness tests; and closure constraints. Moreover, SHACL is peculiar in allowing only a restricted form of expressions (so-called targets) on the left-hand side of inclusion constraints. The goal of this paper is to obtain a clear picture of the impact and expressiveness of these features and restrictions. We show that each of the four features is primitive: using the feature, one can express boolean queries that are not expressible without using the feature. We also show that the restriction that SHACL imposes on allowed targets is inessential, as long as closure constraints are not used.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2022.15 | Published: 2022-01-01 | Pages: 15:1-15:16
Citations: 5
Practical Relational Calculus Query Evaluation
Martin Raszyk, D. Basin, S. Krstic, Dmitriy Traytel
The relational calculus (RC) is a concise, declarative query language. However, existing RC query evaluation approaches are inefficient and often deviate from established algorithms based on finite tables used in database management systems. We devise a new translation of an arbitrary RC query into two safe-range queries, for which the finiteness of the query’s evaluation result is guaranteed. Assuming an infinite domain, the two queries have the following meaning: The first is closed and characterizes the original query’s relative safety, i.e., whether given a fixed database, the original query evaluates to a finite relation. The second safe-range query is equivalent to the original query, if the latter is relatively safe. We compose our translation with other, more standard ones to ultimately obtain two SQL queries. This allows us to use standard database management systems to evaluate arbitrary RC queries. We show that our translation improves the time complexity over existing approaches, which we also empirically confirm in both realistic and synthetic experiments.
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2022.11 | Published: 2022-01-01 | Pages: 11:1-11:21
Citations: 2
On an Information Theoretic Approach to Cardinality Estimation (Invited Talk)
H. Q. Ngo
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2022.1 | Published: 2022-01-01 | Pages: 1:1-1:21
Citations: 2
Counting the Solutions to a Query (Invited Talk)
M. Arenas
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2022.2 | Published: 2022-01-01 | Pages: 2:1-2:1
Citations: 0
Answering Unions of Conjunctive Queries with Ideal Time Guarantees (Invited Talk)
Nofar Carmeli
The holy grail we strive for is, given a query, to identify an algorithm that answers it over general databases with optimal time guarantees for the specific query. In this tutorial, we focus on what can be seen as ideal time guarantees: linear preprocessing (needed to read the input) and constant time per answer (needed to print the output). We seek to understand which queries can be solved with these (or almost these) time guarantees and how. We start with the basic building blocks of database queries: joins, and slowly increase the expressivity by introducing projections and unions until we cover positive relational algebra. We first consider the task of enumerating all query answers and then discuss related, more demanding, tasks such as ordered enumeration and direct access to query answers. We investigate the challenges in answering such queries and provide algorithms and conditional lower bounds.
2012 ACM Classification: Theory of computation → Database query processing and
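To make the "linear preprocessing, constant time per answer" target concrete, here is a minimal Python sketch for the simplest case, the full two-relation join Q(a, b, c) :- R(a, b), S(b, c). It is an illustration under stated assumptions, not an algorithm taken from the tutorial: the relations are duplicate-free lists of pairs, hash-table operations are treated as constant time, and the names `preprocess` and `enumerate_answers` are hypothetical.

```python
from collections import defaultdict

def preprocess(R, S):
    """Single pass over the input: index S by the join attribute and drop
    R-tuples that have no join partner (a semi-join reduction)."""
    s_by_b = defaultdict(list)
    for b, c in S:
        s_by_b[b].append(c)
    # After this filter every remaining R-tuple yields at least one answer.
    r_live = [(a, b) for a, b in R if b in s_by_b]
    return r_live, s_by_b

def enumerate_answers(r_live, s_by_b):
    """Emit the answers one by one. Because dangling tuples were removed in
    preprocessing, each loop step produces an answer, so the work between
    two consecutive answers is bounded by a constant."""
    for a, b in r_live:
        for c in s_by_b[b]:
            yield (a, b, c)
```

Usage: `r_live, idx = preprocess(R, S)` followed by `for answer in enumerate_answers(r_live, idx): ...`. Once projections or unions are added, this simple scheme no longer suffices in general, which is the regime the abstract goes on to discuss.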
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2022.3 | Published: 2022-01-01 | Pages: 3:1-3:1
Citations: 0
How Do Centrality Measures Choose the Root of Trees?
Cristian Riveros, J. Salas, Oskar Skibski
Centrality measures are widely used to assign importance to graph-structured data. Recently, understanding the principles of such measures has attracted a lot of attention. Given that measures are diverse, this research has usually focused on classes of centrality measures. In this work, we provide a different approach by focusing on classes of graphs instead of classes of measures to understand the underlying principles among various measures. More precisely, we study the class of trees. We observe that even in the case of trees, there is no consensus on which node should be selected as the most central. To analyze the behavior of centrality measures on trees, we introduce a property of tree rooting that states that a measure selects one or two adjacent nodes as the most important, and the importance decreases from them in all directions. This property is satisfied by closeness centrality but violated by PageRank. We show that, for several centrality measures that root trees, the comparison of adjacent nodes can be inferred by potential functions that assess the quality of trees. We use these functions to give fundamental insights on rooting and derive a characterization explaining why some measures root trees. Moreover, we provide an almost linear-time algorithm to compute the root of a graph by using potential functions. Finally, using a family of potential functions, we show that many ways of tree rooting exist with desirable properties.
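The tree rooting property can be checked directly on small examples. The Python sketch below computes closeness centrality on a tree and tests the property as stated in the abstract: at most two top-ranked nodes, adjacent if there are two, with scores strictly decreasing along every edge that leads away from them. It is a naive quadratic-time illustration, not the almost linear-time algorithm based on potential functions that the abstract mentions. Closeness is taken here as the reciprocal of the sum of distances, and the tree is assumed to have at least two nodes; the function names are hypothetical.

```python
from collections import deque
from fractions import Fraction

def distances_from(tree, src):
    """Breadth-first search distances from src; tree maps node -> neighbours."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in tree[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(tree):
    """Closeness centrality as 1 / (sum of distances); needs >= 2 nodes."""
    return {u: Fraction(1, sum(distances_from(tree, u).values())) for u in tree}

def roots(score):
    """Nodes with the maximum score, in sorted order."""
    best = max(score.values())
    return sorted(u for u, value in score.items() if value == best)

def roots_tree(tree, score):
    """Check the tree rooting property for the given scores."""
    top = roots(score)
    if len(top) > 2 or (len(top) == 2 and top[1] not in tree[top[0]]):
        return False
    seen = set(top)
    queue = deque(top)
    while queue:
        u = queue.popleft()
        for v in tree[u]:
            if v not in seen:
                # Moving away from the root(s), the score must strictly drop.
                if score[v] >= score[u]:
                    return False
                seen.add(v)
                queue.append(v)
    return True
```

On the path a-b-c-d-e the single root is `c`; on the path a-b-c-d the two adjacent nodes `b` and `c` tie for the top score, and both cases satisfy the property.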
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2023.12 | Published: 2021-12-27 | Pages: 12:1-12:17
Citations: 1
A Dyadic Simulation Approach to Efficient Range-Summability
Jingfan Meng, Huayi Wang, Jun Xu, M. Ogihara
Efficient range-summability (ERS) of a long list of random variables is a fundamental algorithmic problem with three important database applications, namely, data stream processing, space-efficient histogram maintenance (SEHM), and approximate nearest neighbor searches (ANNS). In this work, we propose a novel dyadic simulation framework and use it to develop three novel ERS solutions: Gaussian-dyadic simulation tree (DST), Cauchy-DST, and Random Walk-DST. We also propose novel rejection sampling techniques to make these solutions computationally efficient. Furthermore, we develop a novel k-wise independence theory that allows our ERS solutions to have both high computational efficiency and strong provable independence guarantees.
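As a rough illustration of what range-summability over a dyadic tree can look like in the Gaussian case, the Python sketch below simulates the sum of any range of N = 2^k standard normal variables by visiting O(log N) tree nodes. It relies on a standard fact: if an interval of n i.i.d. N(0, 1) variables has sum S, then the sum of its left half, conditioned on S, is distributed as N(S/2, n/4). This is only a sketch in the spirit of the abstract and should not be read as the authors' construction: seeding Python's `random.Random` per tree node stands in for the hashing scheme, and it carries none of the k-wise independence guarantees the paper develops. The class name `GaussianDyadicTree` is hypothetical.

```python
import random

class GaussianDyadicTree:
    """Implicit dyadic tree over n = 2**log_n standard normal variables."""

    def __init__(self, log_n, seed=0):
        self.n = 1 << log_n
        self.seed = seed
        # The sum of all n variables is N(0, n).
        self.total = self._draw(0, self.n, 0.0, self.n ** 0.5)

    def _draw(self, lo, hi, mean, std):
        # Deterministic per-node randomness: the same node always returns
        # the same value, so overlapping queries stay mutually consistent.
        rng = random.Random(f"{self.seed}:{lo}:{hi}")
        return rng.gauss(mean, std)

    def range_sum(self, left, right):
        """Simulated sum of the variables with indices in [left, right)."""
        return self._query(0, self.n, self.total, left, right)

    def _query(self, lo, hi, total, left, right):
        if right <= lo or hi <= left:
            return 0.0
        if left <= lo and hi <= right:
            return total
        mid = (lo + hi) // 2
        # Left-half sum given the parent sum: N(total / 2, (hi - lo) / 4).
        left_sum = self._draw(lo, mid, total / 2, ((hi - lo) / 4) ** 0.5)
        return (self._query(lo, mid, left_sum, left, right)
                + self._query(mid, hi, total - left_sum, left, right))
```

A query such as `GaussianDyadicTree(20).range_sum(1000, 900000)` touches only a logarithmic number of nodes instead of generating and adding every variable in the range, and it agrees (up to floating-point error) with the sum of the corresponding single-element queries.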
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2022.17 | Published: 2021-09-13 | Pages: 17:1-17:18
Citations: 2
Enumeration Algorithms for Conjunctive Queries with Projection
Shaleen Deep, Xiao Hu, Paraschos Koutris
We investigate the enumeration of query results for an important subset of CQs with projections, namely star and path queries. The task is to design data structures and algorithms that allow for efficient enumeration with delay guarantees after a preprocessing phase. Our main contribution is a series of results based on the idea of interleaving precomputed output with further join processing to maintain delay guarantees, which may be of independent interest. In particular, for star queries, we design combinatorial algorithms that provide instance-specific delay guarantees in linear preprocessing time. These algorithms improve upon the currently best known results. Further, we show how existing results can be improved upon by using fast matrix multiplication. We also present new results involving a tradeoff between preprocessing time and delay guarantees for enumeration of path queries that contain projections. Boolean matrix multiplication is an important query that can be expressed as a CQ with projection where the join attribute is projected away. Our results can therefore also be interpreted as sparse, output-sensitive matrix multiplication with delay guarantees.
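The correspondence between Boolean matrix multiplication and a CQ with projection can be made concrete with a small Python sketch. It evaluates Q(a, c) :- R(a, b), S(b, c), where the join attribute b is projected away, and compares the result with a dense Boolean matrix product. This is the naive join-then-deduplicate evaluation, shown only to illustrate the query; it offers none of the delay guarantees the paper is about. The function names are hypothetical, and the relations are assumed to be sets of index pairs listing the nonzero matrix entries.

```python
from collections import defaultdict

def join_project(R, S):
    """Evaluate Q(a, c) :- R(a, b), S(b, c), projecting away b."""
    s_by_b = defaultdict(set)
    for b, c in S:
        s_by_b[b].add(c)
    answers = set()
    for a, b in R:
        # Several values of b may produce the same (a, c); the set removes
        # these duplicates, which is exactly what projection requires.
        for c in s_by_b.get(b, ()):
            answers.add((a, c))
    return answers

def boolean_product(A, B):
    """Dense Boolean matrix product, used as a reference implementation."""
    inner, cols = len(B), len(B[0])
    return [[any(A[i][k] and B[k][j] for k in range(inner))
             for j in range(cols)] for i in range(len(A))]

def nonzeros(M):
    """Sparse view of a Boolean matrix: the set of (row, column) pairs."""
    return {(i, j) for i, row in enumerate(M) for j, x in enumerate(row) if x}
```

With `A = [[1, 1], [0, 1]]` and `B = [[0, 1], [1, 1]]`, `join_project(nonzeros(A), nonzeros(B))` equals `nonzeros(boolean_product(A, B))`. The duplicate elimination step is where the difficulty of enumerating with projections comes from, since many join results can collapse onto one output pair.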
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2021.14 | Published: 2021-01-11 | Pages: 14:1-14:17
Citations: 14
Fine-Grained Complexity of Regular Path Queries
Katrin Casel, Markus L. Schmid
A regular path query (RPQ) is a regular expression q that returns all node pairs (u, v) from a graph database that are connected by an arbitrary path labelled with a word from L(q). The obvious algorithmic approach to RPQ-evaluation (called PG-approach), i.e., constructing the product graph between an NFA for q and the graph database, is appealing due to its simplicity and also leads to efficient algorithms. However, it is unclear whether the PG-approach is optimal. We address this question by thoroughly investigating which upper complexity bounds can be achieved by the PG-approach, and we complement these with conditional lower bounds (in the sense of the fine-grained complexity framework). A special focus is put on enumeration and delay bounds, as well as the data complexity perspective. A main insight is that we can achieve optimal (or near optimal) algorithms with the PG-approach, but the delay for enumeration is rather high (linear in the database). We explore three successful approaches towards enumeration with sub-linear delay: super-linear preprocessing, approximations of the solution sets, and restricted classes of RPQs.
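The PG-approach described above can be sketched in a few lines of Python: pair every database node with every NFA state, and run a breadth-first search in this product graph from each (node, start state). The encoding is an assumption made for illustration only: the graph database is a list of `(source, label, target)` triples, the NFA has no epsilon transitions and is given as a dictionary from `(state, label)` to a set of successor states, and the function name `evaluate_rpq` is hypothetical.

```python
from collections import defaultdict, deque

def evaluate_rpq(edges, delta, start, accepting):
    """Return all pairs (u, v) connected by a path whose label sequence is
    accepted by the NFA (delta, start, accepting)."""
    out_edges = defaultdict(list)
    nodes = set()
    for u, label, v in edges:
        out_edges[u].append((label, v))
        nodes.update((u, v))

    answers = set()
    for src in nodes:
        # Breadth-first search in the product graph, built on the fly.
        seen = {(src, start)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accepting:
                answers.add((src, node))
            for label, next_node in out_edges[node]:
                for next_state in delta.get((state, label), ()):
                    pair = (next_node, next_state)
                    if pair not in seen:
                        seen.add(pair)
                        queue.append(pair)
    return answers
```

For the query `a b*`, encoded as `delta = {(0, "a"): {1}, (1, "b"): {1}}` with start state `0` and accepting set `{1}`, each source node costs one traversal of the product graph. Reporting a single answer may therefore require exploring a large part of the database, which relates to the high enumeration delay noted in the abstract.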
DOI: https://doi.org/10.4230/LIPIcs.ICDT.2021.19 | Published: 2021-01-06 | Pages: 19:1-19:20
Citations: 10