The problem of approximately computing the k dominant Fourier coefficients of a vector X quickly, and using few samples in the time domain, is known as the Sparse Fourier Transform (sparse FFT) problem. A long line of work on the sparse FFT has resulted in algorithms with O(k log n log(n/k)) runtime [Hassanieh et al., STOC'12] and O(k log n) sample complexity [Indyk et al., FOCS'14]. This paper revisits the sparse FFT problem with the added twist that the sparse coefficients approximately obey a (k0, k1)-block sparse model. In this model, signal frequencies are clustered in k0 intervals of width k1 in Fourier space, and k = k0 k1 is the total sparsity. Our main result is the first sparse FFT algorithm for (k0, k1)-block sparse signals with a sample complexity of O*(k0 k1 + k0 log(1 + k0) log n) at constant signal-to-noise ratios, and sublinear runtime. Our algorithm crucially uses adaptivity to achieve the improved sample complexity bound, and we provide a lower bound showing that this is essential in the Fourier setting: any non-adaptive algorithm must use Ω(k0 k1 log(n/(k0 k1))) samples for the (k0, k1)-block sparse model, ruling out improvements over the vanilla sparsity assumption. Our main technical innovation for adaptivity is a new randomized energy-based importance sampling technique that may be of independent interest.
{"title":"An adaptive sublinear-time block sparse fourier transform","authors":"V. Cevher, M. Kapralov, J. Scarlett, A. Zandieh","doi":"10.1145/3055399.3055462","DOIUrl":"https://doi.org/10.1145/3055399.3055462","url":null,"abstract":"The problem of approximately computing the k dominant Fourier coefficients of a vector X quickly, and using few samples in time domain, is known as the Sparse Fourier Transform (sparse FFT) problem. A long line of work on the sparse FFT has resulted in algorithms with O(klognlog(n/k)) runtime [Hassanieh et al., STOC'12] and O(klogn) sample complexity [Indyk et al., FOCS'14]. This paper revisits the sparse FFT problem with the added twist that the sparse coefficients approximately obey a (k0,k1)-block sparse model. In this model, signal frequencies are clustered in k0 intervals with width k1 in Fourier space, and k= k0k1 is the total sparsity. Our main result is the first sparse FFT algorithm for (k0, k1)-block sparse signals with a sample complexity of O*(k0k1 + k0log(1+ k0)logn) at constant signal-to-noise ratios, and sublinear runtime. Our algorithm crucially uses adaptivity to achieve the improved sample complexity bound, and we provide a lower bound showing that this is essential in the Fourier setting: Any non-adaptive algorithm must use Ω(k0k1logn/k0k1) samples for the (k0,k1)-block sparse model, ruling out improvements over the vanilla sparsity assumption. Our main technical innovation for adaptivity is a new randomized energy-based importance sampling technique that may be of independent interest.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89031162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The field of algorithmic self-assembly is concerned with the computational and expressive power of nanoscale self-assembling molecular systems. In the well-studied cooperative, or temperature 2, abstract tile assembly model it is known that there is a tile set to simulate any Turing machine and an intrinsically universal tile set that simulates the shapes and dynamics of any instance of the model, up to spatial rescaling. It has been an open question whether the seemingly simpler noncooperative, or temperature 1, model is capable of such behaviour. Here we resolve this question negatively: there is no tile set in the noncooperative model that is intrinsically universal, nor one capable of time-bounded Turing machine simulation within a bounded region of the plane. Although the noncooperative model intuitively seems to lack the complexity and power of the cooperative model, it has been exceedingly hard to prove this. One reason is that there have been few tools to analyse the structure of complicated paths in the plane. This paper provides a number of such tools. A second reason is that almost every obvious and small generalisation of the model (e.g. allowing error, 3D, non-square tiles, signals/wires on tiles, tiles that repel each other, parallel synchronous growth) endows it with great computational, and sometimes simulation, power. Our main results show that all of these generalisations provably increase computational and/or simulation power. Our results hold for both deterministic and nondeterministic noncooperative systems. Our first main result stands in stark contrast with the fact that for both the cooperative tile assembly model and for 3D noncooperative tile assembly, there are respective intrinsically universal tile sets. Our second main result gives a new technique (reduction to simulation) for proving negative results about computation in tile assembly.
{"title":"The non-cooperative tile assembly model is not intrinsically universal or capable of bounded Turing machine simulation","authors":"Pierre-Etienne Meunier, D. Woods","doi":"10.1145/3055399.3055446","DOIUrl":"https://doi.org/10.1145/3055399.3055446","url":null,"abstract":"The field of algorithmic self-assembly is concerned with the computational and expressive power of nanoscale self-assembling molecular systems. In the well-studied cooperative, or temperature 2, abstract tile assembly model it is known that there is a tile set to simulate any Turing machine and an intrinsically universal tile set that simulates the shapes and dynamics of any instance of the model, up to spatial rescaling. It has been an open question as to whether the seemingly simpler noncooperative, or temperature 1, model is capable of such behaviour. Here we show that this is not the case by showing that there is no tile set in the noncooperative model that is intrinsically universal, nor one capable of time-bounded Turing machine simulation within a bounded region of the plane. Although the noncooperative model intuitively seems to lack the complexity and power of the cooperative model it has been exceedingly hard to prove this. One reason is that there have been few tools to analyse the structure of complicated paths in the plane. This paper provides a number of such tools. A second reason is that almost every obvious and small generalisation to the model (e.g. allowing error, 3D, non-square tiles, signals/wires on tiles, tiles that repel each other, parallel synchronous growth) endows it with great computational, and sometimes simulation, power. Our main results show that all of these generalisations provably increase computational and/or simulation power. Our results hold for both deterministic and nondeterministic noncooperative systems. Our first main result stands in stark contrast with the fact that for both the cooperative tile assembly model, and for 3D noncooperative tile assembly, there are respective intrinsically universal tilesets. Our second main result gives a new technique (reduction to simulation) for proving negative results about computation in tile assembly.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"40 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80541816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
For every constant ε > 0, we give an exp(Õ(√n))-time algorithm for the 1 vs 1 - ε Best Separable State (BSS) problem of distinguishing, given an n^2 × n^2 matrix ℳ corresponding to a quantum measurement, between the case that there is a separable (i.e., non-entangled) state ρ that ℳ accepts with probability 1, and the case that every separable state is accepted with probability at most 1 - ε. Equivalently, our algorithm takes the description of a subspace 𝒲 ⊆ 𝔽^{n^2} (where 𝔽 can be either the real or complex field) and distinguishes between the case that 𝒲 contains a rank-one matrix, and the case that every rank-one matrix is at least ε-far (in ℓ2 distance) from 𝒲. To the best of our knowledge, this is the first improvement over the brute-force exp(n)-time algorithm for this problem. Our algorithm is based on the sum-of-squares hierarchy and its analysis is inspired by Lovett's proof (STOC '14, JACM '16) that the communication complexity of every rank-n Boolean matrix is bounded by Õ(√n).
{"title":"Quantum entanglement, sum of squares, and the log rank conjecture","authors":"B. Barak, Pravesh Kothari, David Steurer","doi":"10.1145/3055399.3055488","DOIUrl":"https://doi.org/10.1145/3055399.3055488","url":null,"abstract":"For every constant ε>0, we give an exp(Õ(∞n))-time algorithm for the 1 vs 1 - ε Best Separable State (BSS) problem of distinguishing, given an n2 x n2 matrix ℳ corresponding to a quantum measurement, between the case that there is a separable (i.e., non-entangled) state ρ that ℳ accepts with probability 1, and the case that every separable state is accepted with probability at most 1 - ε. Equivalently, our algorithm takes the description of a subspace 𝒲 ⊆ 𝔽n2 (where 𝔽 can be either the real or complex field) and distinguishes between the case that contains a rank one matrix, and the case that every rank one matrix is at least ε far (in 𝓁2 distance) from 𝒲. To the best of our knowledge, this is the first improvement over the brute-force exp(n)-time algorithm for this problem. Our algorithm is based on the sum-of-squares hierarchy and its analysis is inspired by Lovett's proof (STOC '14, JACM '16) that the communication complexity of every rank-n Boolean matrix is bounded by Õ(√n).","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"92 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83789449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We formalize a framework of algebraically natural lower bounds for algebraic circuits. Just as with the natural proofs notion of Razborov and Rudich for Boolean circuit lower bounds, our notion of algebraically natural lower bounds captures nearly all lower bound techniques known. However, unlike the Boolean setting, there has been no concrete evidence demonstrating that this is a barrier to obtaining super-polynomial lower bounds for general algebraic circuits, as there is little understanding of whether algebraic circuits are expressive enough to support "cryptography" secure against algebraic circuits. Following a similar result of Williams in the Boolean setting, we show that the existence of an algebraic natural proofs barrier is equivalent to the existence of succinct derandomization of the polynomial identity testing problem; that is, to whether the coefficient vectors of polylog(N)-degree, polylog(N)-size circuits form a hitting set for the class of poly(N)-degree, poly(N)-size circuits. Further, we give an explicit universal construction showing that if such a succinct hitting set exists, then our universal construction suffices. We also assess the existing literature constructing hitting sets for restricted classes of algebraic circuits and observe that none of them are succinct as given. Yet, we show how to modify some of these constructions to obtain succinct hitting sets. This constitutes the first evidence supporting the existence of an algebraic natural proofs barrier. Our framework is similar to the Geometric Complexity Theory (GCT) program of Mulmuley and Sohoni, except that here we emphasize constructiveness of the proofs while the GCT program emphasizes symmetry. Nevertheless, our succinct hitting sets have relevance to the GCT program, as they imply lower bounds for the complexity of the defining equations of polynomials computed by small circuits.
{"title":"Succinct hitting sets and barriers to proving algebraic circuits lower bounds","authors":"Michael A. Forbes, Amir Shpilka, Ben lee Volk","doi":"10.1145/3055399.3055496","DOIUrl":"https://doi.org/10.1145/3055399.3055496","url":null,"abstract":"We formalize a framework of algebraically natural lower bounds for algebraic circuits. Just as with the natural proofs notion of Razborov and Rudich for boolean circuit lower bounds, our notion of algebraically natural lower bounds captures nearly all lower bound techniques known. However, unlike the boolean setting, there has been no concrete evidence demonstrating that this is a barrier to obtaining super-polynomial lower bounds for general algebraic circuits, as there is little understanding whether algebraic circuits are expressive enough to support \"cryptography\" secure against algebraic circuits. Following a similar result of Williams in the boolean setting, we show that the existence of an algebraic natural proofs barrier is equivalent to the existence of succinct derandomization of the polynomial identity testing problem. That is, whether the coefficient vectors of polylog(N)-degree polylog(N)-size circuits is a hitting set for the class of poly(N)-degree poly(N)-size circuits. Further, we give an explicit universal construction showing that if such a succinct hitting set exists, then our universal construction suffices. Further, we assess the existing literature constructing hitting sets for restricted classes of algebraic circuits and observe that none of them are succinct as given. Yet, we show how to modify some of these constructions to obtain succinct hitting sets. This constitutes the first evidence supporting the existence of an algebraic natural proofs barrier. Our framework is similar to the Geometric Complexity Theory (GCT) program of Mulmuley and Sohoni, except that here we emphasize constructiveness of the proofs while the GCT program emphasizes symmetry. Nevertheless, our succinct hitting sets have relevance to the GCT program as they imply lower bounds for the complexity of the defining equations of polynomials computed by small circuits.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"46 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79161560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Let P : {0,1}^k → {0,1} be a nontrivial k-ary predicate. Consider a random instance of the constraint satisfaction problem CSP(P) on n variables with Δn constraints, each being P applied to k randomly chosen literals. Provided the constraint density satisfies Δ ≫ 1, such an instance is unsatisfiable with high probability. The refutation problem is to efficiently find a proof of unsatisfiability. We show that whenever the predicate P supports a t-wise uniform probability distribution on its satisfying assignments, the sum-of-squares (SOS) algorithm of degree d = Θ(n / (Δ^{2/(t-1)} log Δ)) (which runs in time n^{O(d)}) cannot refute a random instance of CSP(P). In particular, the polynomial-time SOS algorithm requires Ω(n^{(t+1)/2}) constraints to refute random instances of CSP(P) when P supports a t-wise uniform distribution on its satisfying assignments. Together with recent work of Lee, Raghavendra, and Steurer (2015), our result also implies that any polynomial-size semidefinite programming relaxation for refutation requires at least Ω(n^{(t+1)/2}) constraints. More generally, we consider the δ-refutation problem, in which the goal is to certify that at most a (1 - δ)-fraction of constraints can be simultaneously satisfied. We show that if P is δ-close to supporting a t-wise uniform distribution on satisfying assignments, then the degree-Θ(n / (Δ^{2/(t-1)} log Δ)) SOS algorithm cannot (δ + o(1))-refute a random instance of CSP(P). This is the first result to show a distinction between the degree SOS needs to solve the refutation problem and the degree it needs to solve the harder δ-refutation problem. Our results (which also extend with no change to CSPs over larger alphabets) subsume all previously known lower bounds for semialgebraic refutation of random CSPs. For every constraint predicate P, they give a three-way hardness tradeoff between the density of constraints, the SOS degree (and hence running time), and the strength of the refutation. By recent algorithmic results of Allen, O'Donnell, and Witmer (2015) and Raghavendra, Rao, and Schramm (2016), this full three-way tradeoff is tight, up to lower-order factors.
{"title":"Sum of squares lower bounds for refuting any CSP","authors":"Pravesh Kothari, R. Mori, R. O'Donnell, David Witmer","doi":"10.1145/3055399.3055485","DOIUrl":"https://doi.org/10.1145/3055399.3055485","url":null,"abstract":"Let P:{0,1}k → {0,1} be a nontrivial k-ary predicate. Consider a random instance of the constraint satisfaction problem (P) on n variables with Δ n constraints, each being P applied to k randomly chosen literals. Provided the constraint density satisfies Δ ≫ 1, such an instance is unsatisfiable with high probability. The refutation problem is to efficiently find a proof of unsatisfiability. We show that whenever the predicate P supports a t-wise uniform probability distribution on its satisfying assignments, the sum of squares (SOS) algorithm of degree d = Θ(n/Δ2/(t-1) logΔ) (which runs in time nO(d)) cannot refute a random instance of (P). In particular, the polynomial-time SOS algorithm requires Ω(n(t+1)/2) constraints to refute random instances of CSP(P) when P supports a t-wise uniform distribution on its satisfying assignments. Together with recent work of Lee, Raghavendra, Steurer (2015), our result also implies that any polynomial-size semidefinite programming relaxation for refutation requires at least Ω(n(t+1)/2) constraints. More generally, we consider the δ-refutation problem, in which the goal is to certify that at most a (1 - δ)-fraction of constraints can be simultaneously satisfied. We show that if P is δ-close to supporting a t-wise uniform distribution on satisfying assignments, then the degree-Θ(n/Δ2/(t - 1) logΔ) SOS algorithm cannot (δ+o(1))-refute a random instance of CSP(P). This is the first result to show a distinction between the degree SOS needs to solve the refutation problem and the degree it needs to solve the harder δ-refutation problem. Our results (which also extend with no change to CSPs over larger alphabets) subsume all previously known lower bounds for semialgebraic refutation of random CSPs. For every constraint predicate P, they give a three-way hardness tradeoff between the density of constraints, the SOS degree (hence running time), and the strength of the refutation. By recent algorithmic results of Allen, O'Donnell, Witmer (2015) and Raghavendra, Rao, Schramm (2016), this full three-way tradeoff is tight, up to lower-order factors.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"61 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88340091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many machine learning applications use latent variable models to explain structure in data, whereby visible variables (i.e., the coordinates of a given data point) are explained as a probabilistic function of some hidden variables. Learning the model, that is, the mapping from hidden variables to visible ones and vice versa, is NP-hard even in very simple settings. In recent years, provably efficient algorithms were nevertheless developed for models with linear structure: topic models, mixture models, hidden Markov models, etc. These algorithms use matrix or tensor decomposition, and make some reasonable assumptions about the parameters of the underlying model. But matrix or tensor decomposition seems of little use when the latent variable model has nonlinearities. The current paper shows how to make progress: tensor decomposition is applied to learning the single-layer noisy-OR network, which is a textbook example of a Bayes net, used for example in the classic QMR-DT software for diagnosing which disease(s) a patient may have from the symptoms they exhibit. The technical novelty here, which should be useful in other settings in the future, is an analysis of tensor decomposition in the presence of systematic error (i.e., where the noise/error is correlated with the signal, and does not decrease as the number of samples goes to infinity). This requires rethinking all steps of tensor decomposition methods from the ground up. For simplicity, our analysis is stated assuming that the network parameters were chosen from a probability distribution, but the method seems more generally applicable.
{"title":"Provable learning of noisy-OR networks","authors":"Sanjeev Arora, Rong Ge, Tengyu Ma, Andrej Risteski","doi":"10.1145/3055399.3055482","DOIUrl":"https://doi.org/10.1145/3055399.3055482","url":null,"abstract":"Many machine learning applications use latent variable models to explain structure in data, whereby visible variables (= coordinates of the given datapoint) are explained as a probabilistic function of some hidden variables. Learning the model ---that is, the mapping from hidden variables to visible ones and vice versa---is NP-hard even in very simple settings. In recent years, provably efficient algorithms were nevertheless developed for models with linear structure: topic models, mixture models, hidden markov models, etc. These algorithms use matrix or tensor decomposition, and make some reasonable assumptions about the parameters of the underlying model. But matrix or tensor decomposition seems of little use when the latent variable model has nonlinearities. The current paper shows how to make progress: tensor decomposition is applied for learning the single-layer noisy-OR network, which is a textbook example of a bayes net, and used for example in the classic QMR-DT software for diagnosing which disease(s) a patient may have by observing the symptoms he/she exhibits. The technical novelty here, which should be useful in other settings in future, is analysis of tensor decomposition in presence of systematic error (i.e., where the noise/error is correlated with the signal, and doesn't decrease as number of samples goes to infinity). This requires rethinking all steps of tensor decomposition methods from the ground up. For simplicity our analysis is stated assuming that the network parameters were chosen from a probability distribution but the method seems more generally applicable.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86500393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider the problem of approximate set similarity search under Braun-Blanquet similarity B(x, y) = |x ∩ y| / max(|x|, |y|). The (b1, b2)-approximate Braun-Blanquet similarity search problem is to preprocess a collection of sets P such that, given a query set q, if there exists x ∈ P with B(q, x) ≥ b1, then we can efficiently return x′ ∈ P with B(q, x′) > b2. We present a simple data structure that solves this problem with space usage O(n^{1+ρ} log n + Σ_{x ∈ P} |x|) and query time O(|q| n^ρ log n), where n = |P| and ρ = log(1/b1)/log(1/b2). Making use of existing lower bounds for locality-sensitive hashing by O'Donnell et al. (TOCT 2014), we show that this value of ρ is tight across the parameter space, i.e., for every choice of constants 0 < b2 < b1 < 1. In the case where all sets have the same size, our solution strictly improves upon the value of ρ that can be obtained through the use of state-of-the-art data-independent techniques in the Indyk-Motwani locality-sensitive hashing framework (STOC 1998), such as Broder's MinHash (CCS 1997) for Jaccard similarity and Andoni et al.'s cross-polytope LSH (NIPS 2015) for cosine similarity. Surprisingly, even though our solution is data-independent, for a large part of the parameter space we outperform the currently best data-dependent method by Andoni and Razenshteyn (STOC 2015).
{"title":"Set similarity search beyond MinHash","authors":"Tobias Christiani, R. Pagh","doi":"10.1145/3055399.3055443","DOIUrl":"https://doi.org/10.1145/3055399.3055443","url":null,"abstract":"We consider the problem of approximate set similarity search under Braun-Blanquet similarity B(x, y) = |x ∩ y| / max(|x|, |y|). The (b1, b2)-approximate Braun-Blanquet similarity search problem is to preprocess a collection of sets P such that, given a query set q, if there exists x Ε P with B(q, x) ≥ b1, then we can efficiently return x′ Ε P with B(q, x′) > b2. We present a simple data structure that solves this problem with space usage O(n1+ρlogn + ∑x ε P|x|) and query time O(|q|nρ logn) where n = |P| and ρ = log(1/b1)/log(1/b2). Making use of existing lower bounds for locality-sensitive hashing by O'Donnell et al. (TOCT 2014) we show that this value of ρ is tight across the parameter space, i.e., for every choice of constants 0 < b2 < b1 < 1. In the case where all sets have the same size our solution strictly improves upon the value of ρ that can be obtained through the use of state-of-the-art data-independent techniques in the Indyk-Motwani locality-sensitive hashing framework (STOC 1998) such as Broder's MinHash (CCS 1997) for Jaccard similarity and Andoni et al.'s cross-polytope LSH (NIPS 2015) for cosine similarity. Surprisingly, even though our solution is data-independent, for a large part of the parameter space we outperform the currently best data-dependent method by Andoni and Razenshteyn (STOC 2015).","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90980984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the trace reconstruction problem, an unknown bit string x ∈ {0,1}^n is observed through the deletion channel, which deletes each bit of x with some constant probability q, yielding a contracted string x̃. How many independent copies of x̃ are needed to reconstruct x with high probability? Prior to this work, the best upper bound, due to Holenstein, Mitzenmacher, Panigrahy, and Wieder (2008), was exp(O(n^{1/2})). We improve this bound to exp(O(n^{1/3})) using statistics of individual bits in the output, and show that this bound is sharp in the restricted model where this is the only information used. Our method, which uses elementary complex analysis, can also handle insertions. Similar results were obtained independently and simultaneously by Anindya De, Ryan O'Donnell, and Rocco Servedio.
{"title":"Trace reconstruction with exp(O(n1/3)) samples","authors":"F. Nazarov, Y. Peres","doi":"10.1145/3055399.3055494","DOIUrl":"https://doi.org/10.1145/3055399.3055494","url":null,"abstract":"In the trace reconstruction problem, an unknown bit string x ∈ {0,1}n is observed through the deletion channel, which deletes each bit of x with some constant probability q, yielding a contracted string x. How many independent copies of x are needed to reconstruct x with high probability? Prior to this work, the best upper bound, due to Holenstein, Mitzenmacher, Panigrahy, and Wieder (2008), was exp(O(n1/2)). We improve this bound to exp(O(n1/3)) using statistics of individual bits in the output and show that this bound is sharp in the restricted model where this is the only information used. Our method, that uses elementary complex analysis, can also handle insertions. Similar results were obtained independently and simultaneously by Anindya De, Ryan O'Donnell and Rocco Servedio.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81864643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the (deletion-channel) trace reconstruction problem, there is an unknown n-bit source string x. An algorithm is given access to independent traces of x, where a trace is formed by deleting each bit of x independently with probability δ. The goal of the algorithm is to recover x exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruction problem was due to Holenstein et al. [SODA 2008]; it uses exp(O(n^{1/2})) samples and running time for any fixed 0 < δ < 1. It is also what we call a "mean-based algorithm", meaning that it only uses the empirical means of the individual bits of the traces. Holenstein et al. also gave a lower bound, showing that any mean-based algorithm must use at least n^{Ω(log n)} samples. In this paper we improve both of these results, obtaining matching upper and lower bounds for mean-based trace reconstruction. For any constant deletion rate 0 < δ < 1, we give a mean-based algorithm that uses exp(O(n^{1/3})) time and traces; we also prove that any mean-based algorithm must use at least exp(Ω(n^{1/3})) traces. In fact, we obtain matching upper and lower bounds even for δ subconstant and ρ := 1 - δ subconstant: when (log^3 n)/n ≪ δ ≤ 1/2 the bound is exp(Θ(δn)^{1/3}), and when 1/√n ≪ ρ ≤ 1/2 the bound is exp(Θ(n/ρ)^{1/3}). Our proofs involve estimates for the maxima of Littlewood polynomials on complex disks. We show that these techniques can also be used to perform trace reconstruction with random insertions and bit-flips in addition to deletions. We also find a surprising result: for deletion probabilities δ > 1/2, the presence of insertions can actually help with trace reconstruction.
{"title":"Optimal mean-based algorithms for trace reconstruction","authors":"Anindya De, R. O'Donnell, R. Servedio","doi":"10.1145/3055399.3055450","DOIUrl":"https://doi.org/10.1145/3055399.3055450","url":null,"abstract":"In the (deletion-channel) trace reconstruction problem, there is an unknown n-bit source string x. An algorithm is given access to independent traces of x, where a trace is formed by deleting each bit of x independently with probability δ. The goal of the algorithm is to recover x exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruction problem was due to Holenstein et al. [SODA 2008]; it uses exp(O(n1/2)) samples and running time for any fixed 0 < δ < 1. It is also what we call a \"mean-based algorithm\", meaning that it only uses the empirical means of the individual bits of the traces. Holenstein et al. also gave a lower bound, showing that any mean-based algorithm must use at least nΩ(logn) samples. In this paper we improve both of these results, obtaining matching upper and lower bounds for mean-based trace reconstruction. For any constant deletion rate 0 < Ω < 1, we give a mean-based algorithm that uses exp(O(n1/3)) time and traces; we also prove that any mean-based algorithm must use at least exp(Ω(n1/3)) traces. In fact, we obtain matching upper and lower bounds even for Ω subconstant and ρ := 1 - Ω subconstant: when (log3 n)/n ≪ Ω ≤ 1/2 the bound is exp(-Θ(δδ n)1/3), and when 1/√n ≪ ρ ≥ 1/2 the bound is exp(-Θ(n/Θ)1/3). Our proofs involve estimates for the maxima of Littlewood polynomials on complex disks. We show that these techniques can also be used to perform trace reconstruction with random insertions and bit-flips in addition to deletions. We also find a surprising result: for deletion probabilities δ > 1/2, the presence of insertions can actually help with trace reconstruction.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74024974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Set functions with convenient properties (such as submodularity) appear in application areas of current interest, such as algorithmic game theory, and allow for improved optimization algorithms. It is natural to ask (e.g., in the context of data-driven optimization) how robust such properties are, and whether small deviations from them can be tolerated. We consider two such questions in the important special case of linear set functions. One question that we address is whether any set function that approximately satisfies the modularity equation (linear functions satisfy the modularity equation exactly) is close to a linear function. The answer to this is positive (in a precise formal sense), as shown by Kalton and Roberts [1983] (and further improved by Bondarenko, Prymak, and Radchenko [2013]). We revisit their proof idea, which is based on expander graphs, and provide significantly stronger upper bounds by combining it with new techniques. Furthermore, we provide improved lower bounds for this problem. Another question that we address is how to learn a linear function h that is close to an approximately linear function f, while querying the value of f on only a small number of sets. We present a deterministic algorithm that makes only linearly many (in the number of items) nonadaptive queries, thereby improving over a previous algorithm of Chierichetti, Das, Dasgupta, and Kumar [2015], which is randomized and makes more than a quadratic number of queries. Our learning algorithm is based on a Hadamard transform.
{"title":"Approximate modularity revisited","authors":"U. Feige, M. Feldman, Inbal Talgam-Cohen","doi":"10.1145/3055399.3055476","DOIUrl":"https://doi.org/10.1145/3055399.3055476","url":null,"abstract":"Set functions with convenient properties (such as submodularity) appear in application areas of current interest, such as algorithmic game theory, and allow for improved optimization algorithms. It is natural to ask (e.g., in the context of data driven optimization) how robust such properties are, and whether small deviations from them can be tolerated. We consider two such questions in the important special case of linear set functions. One question that we address is whether any set function that approximately satisfies the modularity equation (linear functions satisfy the modularity equation exactly) is close to a linear function. The answer to this is positive (in a precise formal sense) as shown by Kalton and Roberts [1983] (and further improved by Bondarenko, Prymak, and Radchenko [2013]). We revisit their proof idea that is based on expander graphs, and provide significantly stronger upper bounds by combining it with new techniques. Furthermore, we provide improved lower bounds for this problem. Another question that we address is that of how to learn a linear function h that is close to an approximately linear function f, while querying the value of f on only a small number of sets. We present a deterministic algorithm that makes only linearly many (in the number of items) nonadaptive queries, by this improving over a previous algorithm of Chierichetti, Das, Dasgupta and Kumar [2015] that is randomized and makes more than a quadratic number of queries. Our learning algorithm is based on a Hadamard transform.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84147076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}