Proceedings of the ... ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems: latest articles


Subspace exploration: Bounds on Projected Frequency Estimation.

Pub Date : 2021-06-01 Epub Date: 2021-06-20 DOI: 10.1145/3452021.3458312

Graham Cormode, Charlie Dickens, David P Woodruff

Given an $n \times d$ dimensional dataset $A$, a projection query specifies a subset $C \subseteq [d]$ of columns which yields a new $n \times |C|$ array. We study the space complexity of computing data analysis functions over such subspaces, including heavy hitters and norms, when the subspaces are revealed only after observing the data. We show that this important class of problems is typically hard: for many problems, we show $2^{\Omega(d)}$ lower bounds. However, we present upper bounds which demonstrate space dependency better than $2^d$. That is, for $c, c' \in (0, 1)$ and a parameter $N = 2^d$, an $N^c$-approximation can be obtained in space $\min(N^{c'}, n)$, showing that it is possible to improve on the naïve approach of keeping information for all $2^d$ subsets of $d$ columns. Our results are based on careful constructions of instances using coding theory and novel combinatorial reductions that exhibit such space-approximation tradeoffs.
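To make the setting concrete, here is a minimal Python sketch of a projection query and of exact heavy hitters over the projected rows. It stores the whole dataset, so it illustrates only the problem definition, not the paper's space-efficient upper bounds. The function names, the threshold parameter `phi`, and the toy array are illustrative assumptions.

```python
from collections import Counter


def project(rows, cols):
    """Keep only the columns in `cols`, giving an n x |C| array."""
    return [tuple(row[j] for j in cols) for row in rows]


def projected_frequencies(rows, cols):
    """Exact count of each distinct projected row (no sketching)."""
    return Counter(project(rows, cols))


def heavy_hitters(rows, cols, phi):
    """Projected rows whose count is at least phi * n (phi is an assumed threshold)."""
    n = len(rows)
    freq = projected_frequencies(rows, cols)
    return {pattern: count for pattern, count in freq.items() if count >= phi * n}


# Toy 4 x 3 binary dataset (hypothetical values).
A = [
    (0, 1, 0),
    (0, 1, 1),
    (1, 1, 0),
    (0, 0, 1),
]

# The column subset is chosen only after the data has been seen.
print(heavy_hitters(A, cols=(0, 1), phi=0.5))
```

Because the query columns arrive after the data, a summary that precomputes answers would have to cover every possible column subset, which is the naïve approach that the abstract's upper bounds improve on.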

Pages: 273-284

Citations: 1

PODS'21: Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Virtual Event, China, June 20-25, 2021

Pub Date : 2021-01-01 DOI: 10.1145/3452021

Citations: 1

Computing Optimal Repairs for Functional Dependencies.

Pub Date : 2018-06-01 DOI: 10.1145/3196959.3196980

Ester Livshits, Benny Kimelfeld, Sudeepa Roy

We investigate the complexity of computing an optimal repair of an inconsistent database, in the case where integrity constraints are Functional Dependencies (FDs). We focus on two types of repairs: an optimal subset repair (optimal S-repair) that is obtained by a minimum number of tuple deletions, and an optimal update repair (optimal U-repair) that is obtained by a minimum number of value (cell) updates. For computing an optimal S-repair, we present a polynomial-time algorithm that succeeds on certain sets of FDs and fails on others. We prove the following about the algorithm. When it succeeds, it can also incorporate weighted tuples and duplicate tuples. When it fails, the problem is NP-hard, and in fact, APX-complete (hence, cannot be approximated better than some constant). Thus, we establish a dichotomy in the complexity of computing an optimal S-repair. We present general analysis techniques for the complexity of computing an optimal U-repair, some based on the dichotomy for S-repairs. We also draw a connection to a past dichotomy in the complexity of finding a "most probable database" that satisfies a set of FDs with a single attribute on the left-hand side; the case of general FDs was left open, and we show how our dichotomy provides the missing generalization and thereby settles the open problem.
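As a rough illustration of an S-repair, the sketch below handles only the simplest case, a single FD X → Y. There, two tuples conflict exactly when they agree on X and differ on Y, so keeping the heaviest Y-block within each X-group deletes the minimum total weight. This is not the paper's algorithm, which the abstract does not spell out. The function name, the dictionary representation of tuples, and the sample relation are assumptions.

```python
from collections import defaultdict


def optimal_s_repair_single_fd(tuples, lhs, rhs, weight=None):
    """Optimal subset repair for ONE functional dependency lhs -> rhs.

    tuples: list of dicts (duplicates allowed); lhs, rhs: lists of attribute names.
    weight: optional function tuple -> positive number (defaults to 1 per tuple).
    """
    weight = weight or (lambda t: 1)

    # groups[x][y] holds every tuple with lhs-value x and rhs-value y.
    groups = defaultdict(lambda: defaultdict(list))
    for t in tuples:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        groups[x][y].append(t)

    repair = []
    for blocks in groups.values():
        # Blocks in the same group are pairwise conflicting, so keep the heaviest one.
        best = max(blocks.values(), key=lambda block: sum(weight(t) for t in block))
        repair.extend(best)
    return repair


# Hypothetical relation violating the FD emp -> dept.
rel = [
    {"emp": "ann", "dept": "db"},
    {"emp": "ann", "dept": "db"},
    {"emp": "ann", "dept": "ai"},
    {"emp": "bob", "dept": "ai"},
]
repair = optimal_s_repair_single_fd(rel, ["emp"], ["dept"])
print(len(rel) - len(repair), "tuple(s) deleted")
```

With several interacting FDs this greedy grouping no longer applies directly, which is where the dichotomy described in the abstract comes in.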

Pages: 225-237

Citations: 3

Relational database behavior: utilizing relational discrete event systems and models

Pub Date : 2018-03-06 DOI: 10.1145/73721.73754

Z. Kedem, A. Tuzhilin

Behavior of relational databases is studied within the framework of Relational Discrete Event Systems (RDESes) and Models (RDEMs). Production system and recurrence equation RDEMs are introduced, and their expressive powers are compared. Non-deterministic behavior is defined for both RDEMs and the expressive power of deterministic and non-deterministic production rule programs is also compared. This comparison shows that non-determinism increases expressive power of production systems. A formal concept of a production system interpreter is defined, and several specific interpreters are proposed. One interpreter, called parallel deterministic, is shown to be better than others in many respects, including the conflict resolution module of OPS5.


Citations: 9

Data Citation: a Computational Challenge.

Pub Date : 2017-05-01 DOI: 10.1145/3034786.3056123

Susan B Davidson, Peter Buneman, Daniel Deutch, Tova Milo, Gianmaria Silvello

Data citation is an interesting computational challenge, whose solution draws on several well-studied problems in database theory: query answering using views, and provenance. We describe the problem, suggest an approach to its solution, and highlight several open research problems, both practical and theoretical.


Citations: 11

A semantic approach to correctness of concurrent transaction executions

Pub Date : 2015-09-07 DOI: 10.1145/325405.325416

A. Tuzhilin, P. Spirakis

Citations: 8

Foundations of data-aware process analysis: a database theory perspective

Pub Date : 2013-06-22 DOI: 10.1145/2463664.2467796

Diego Calvanese, Giuseppe De Giacomo, M. Montali

In this work we survey the research on foundations of data-aware (business) processes that has been carried out in the database theory community. We show that this community has indeed developed over the years a multi-faceted culture of merging data and processes. We argue that it is this community that should lay the foundations to solve, at least from the point of view of formal analysis, the dichotomy between data and processes still persisting in business process management.


Citations: 153

On XPath with transitive axes and data tests

Pub Date : 2013-06-22 DOI: 10.1145/2463664.2463675

Diego Figueira

We study the satisfiability problem for XPath with data equality tests. XPath is a node selecting language for XML documents whose satisfiability problem is known to be undecidable, even for very simple fragments. However, we show that the satisfiability for XPath with the rightward, leftward and downward reflexive-transitive axes (namely following-sibling-or-self, preceding-sibling-or-self, descendant-or-self) is decidable. Our algorithm yields a complexity of 3EXPSPACE, and we also identify an expressive-equivalent normal form for the logic for which the satisfiability problem is in 2EXPSPACE. These results are in contrast with the undecidability of the satisfiability problem as soon as we replace the reflexive-transitive axes with just transitive (non-reflexive) ones.


Citations: 13

Well-founded semantics for extended datalog and ontological reasoning

Pub Date : 2013-06-22 DOI: 10.1145/2463664.2465229

André Hernich, C. Kupke, Thomas Lukasiewicz, G. Gottlob

The Datalog± family of expressive extensions of Datalog has recently been introduced as a new paradigm for query answering over ontologies, which captures and extends several common description logics. It extends plain Datalog by features such as existentially quantified rule heads and, at the same time, restricts the rule syntax so as to achieve decidability and tractability. In this paper, we continue the research on Datalog±. More precisely, we generalize the well-founded semantics (WFS), as the standard semantics for nonmonotonic normal programs in the database context, to Datalog± programs with negation under the unique name assumption (UNA). We prove that for guarded Datalog± with negation under the standard WFS, answering normal Boolean conjunctive queries is decidable, and we provide precise complexity results for this problem, namely, in particular, completeness for PTIME (resp., 2-EXPTIME) in the data (resp., combined) complexity.


Citations: 32

Communication steps for parallel query processing

Pub Date : 2013-06-22 DOI: 10.1145/2463664.2465224

P. Beame, Paraschos Koutris, Dan Suciu

We consider the problem of computing a relational query q on a large input database of size n, using a large number p of servers. The computation is performed in rounds, and each server can receive only $O(n/p^{1-\varepsilon})$ bits of data, where $\varepsilon \in [0,1]$ is a parameter that controls replication. We examine how many global communication steps are needed to compute q. We establish both lower and upper bounds, in two settings. For a single round of communication, we give lower bounds in the strongest possible model, where arbitrary bits may be exchanged; we show that any algorithm requires $\varepsilon \geq 1 - 1/\tau^*$, where $\tau^*$ is the fractional vertex cover of the hypergraph of q. We also give an algorithm that matches the lower bound for a specific class of databases. For multiple rounds of communication, we present lower bounds in a model where routing decisions for a tuple are tuple-based. We show that for the class of tree-like queries there exists a tradeoff between the number of rounds and the space exponent $\varepsilon$. The lower bounds for multiple rounds are the first of their kind. Our results also imply that transitive closure cannot be computed in $O(1)$ rounds of communication.
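The one-round bound can be read off numerically. The sketch below assumes the fractional vertex cover value τ* is already known for the query (it does not compute it) and evaluates the smallest admissible space exponent 1 − 1/τ* together with the resulting per-server load, ignoring constant factors. The triangle query, whose hypergraph is a 3-cycle with τ* = 3/2, is an illustrative example not taken from the abstract; the function names are hypothetical.

```python
from fractions import Fraction


def min_space_exponent(tau_star):
    """Smallest epsilon permitted by the one-round bound epsilon >= 1 - 1/tau*."""
    return 1 - 1 / Fraction(tau_star)


def load_per_server(n, p, eps):
    """Per-server load n / p^(1 - eps), with constant factors ignored."""
    return n / p ** (1 - float(eps))


# Triangle query R(x,y), S(y,z), T(z,x): weight 1/2 on each variable covers
# every atom, giving tau* = 3/2 (assumed here, not computed).
eps = min_space_exponent(Fraction(3, 2))
print(eps)                                # 1/3
print(load_per_server(10**6, 64, eps))    # about n / p^(2/3)
```

For a query with τ* = 1, such as the two-way join R(x,y), S(y,z), the same formula gives a minimum exponent of 0, meaning the bound does not force any replication in one round.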

Pages: 273-284

Citations: 262
