Exact Model Counting of Query Expressions

ACM Transactions on Database Systems (TODS) Pub Date : 2017-02-03 DOI:10.1145/2984632

P. Beame, Jerry Li, Sudeepa Roy, Dan Suciu


Citations: 65

Abstract

We prove exponential lower bounds on the running time of the state-of-the-art exact model counting algorithms—algorithms for exactly computing the number of satisfying assignments, or the satisfying probability, of Boolean formulas. These algorithms can be seen, either directly or indirectly, as building Decision-Decomposable Negation Normal Form (decision-DNNF) representations of the input Boolean formulas. Decision-DNNFs are a special case of d-DNNFs where d stands for deterministic. We show that any knowledge compilation representations from a class (called DLDDs in this article) that contain decision-DNNFs can be converted into equivalent Free Binary Decision Diagrams (FBDDs), also known as Read-Once Branching Programs, with only a quasi-polynomial increase in representation size. Leveraging known exponential lower bounds for FBDDs, we then obtain similar exponential lower bounds for decision-DNNFs, which imply exponential lower bounds for model-counting algorithms. We also separate the power of decision-DNNFs from d-DNNFs and a generalization of decision-DNNFs known as AND-FBDDs. We then prove new lower bounds for FBDDs that yield exponential lower bounds on the running time of these exact model counters when applied to the problem of query evaluation in tuple-independent probabilistic databases—computing the probability of an answer to a query given independent probabilities of the individual tuples in a database instance. This approach to the query evaluation problem, in which one first obtains the lineage for the query and database instance as a Boolean formula and then performs weighted model counting on the lineage, is known as grounded inference. A second approach, known as lifted inference or extensional query evaluation, exploits the high-level structure of the query as a first-order formula. 
Although it has been widely believed that lifted inference is strictly more powerful than grounded inference on the lineage alone, no formal separation has previously been shown for query evaluation. In this article, we show such a formal separation for the first time. In particular, we exhibit a family of database queries for which polynomial-time extensional query evaluation techniques were previously known but for which query evaluation via grounded inference using the state-of-the-art exact model counters requires exponential time.
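To make the two evaluation styles named in the abstract concrete, here is a minimal sketch on a toy query, q = ∃x ∃y R(x) ∧ S(x, y), over a tiny tuple-independent database. The relation names, tuples, probabilities, and function names are all assumptions chosen for illustration; none of them come from the article. The grounded version builds the lineage and does weighted model counting by naive enumeration, which is not one of the decision-DNNF-based counters the article analyzes. The lifted version uses the standard independent-project and independent-join rules for this query.

```python
from itertools import product

# Hypothetical tuple-independent database: tuple -> marginal probability.
R = {"a": 0.5, "b": 0.8}
S = {("a", 1): 0.4, ("a", 2): 0.5, ("b", 1): 0.25}


def lineage_clauses(R, S):
    """Lineage of q = exists x, y: R(x) AND S(x, y), as a DNF over tuple variables."""
    return [(("R", x), ("S", x, y)) for (x, y) in S if x in R]


def grounded_probability(R, S):
    """Grounded inference: weighted model counting on the lineage.

    Enumerates every truth assignment of the tuple variables, so the cost
    is exponential in the number of tuples by construction.
    """
    probs = {("R", x): p for x, p in R.items()}
    probs.update({("S", x, y): p for (x, y), p in S.items()})
    variables = list(probs)
    clauses = lineage_clauses(R, S)
    total = 0.0
    for bits in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, bits))
        if any(all(world[v] for v in clause) for clause in clauses):
            weight = 1.0
            for v in variables:
                weight *= probs[v] if world[v] else 1.0 - probs[v]
            total += weight
    return total


def lifted_probability(R, S):
    """Lifted (extensional) evaluation using the query's structure.

    P(q) = 1 - prod_x (1 - P(R(x)) * (1 - prod_y (1 - P(S(x, y))))).
    """
    none_true = 1.0
    for x, p_r in R.items():
        no_s = 1.0  # probability that no S(x, y) tuple is present for this x
        for (x2, _y), p_s in S.items():
            if x2 == x:
                no_s *= 1.0 - p_s
        none_true *= 1.0 - p_r * (1.0 - no_s)
    return 1.0 - none_true


grounded = grounded_probability(R, S)
lifted = lifted_probability(R, S)
print(f"grounded={grounded:.6f} lifted={lifted:.6f}")
```

Both functions compute the same probability on this instance. The toy query is simple enough that it does not exhibit the separation described above; the sketch only shows what "count over the lineage" and "exploit the first-order structure" mean in code.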


