We present an $O^*(n^3)$ randomized algorithm for estimating the volume of a well-rounded convex body given by a membership oracle, improving on the previous best complexity of $O^*(n^4)$. The new algorithmic ingredient is an accelerated cooling schedule where the rate of cooling increases with the temperature. Previously, the known approach for potentially achieving such complexity relied on a positive resolution of the KLS hyperplane conjecture, a central open problem in convex geometry.
{"title":"Bypassing KLS: Gaussian Cooling and an O^*(n^3) Volume Algorithm","authors":"Benjamin R. Cousins, S. Vempala","doi":"10.1145/2746539.2746563","DOIUrl":"https://doi.org/10.1145/2746539.2746563","journal":{"name":"Proceedings of the forty-seventh annual ACM symposium on Theory of Computing"},"publicationDate":"2014-09-21"}
The celebrated Cheeger's Inequality [AM85,a86] establishes a bound on the expansion of a graph via its spectrum. This inequality is central to a rich spectral theory of graphs, based on studying the eigenvalues and eigenvectors of the adjacency matrix (and other related matrices) of graphs. It has remained open to define a suitable spectral model for hypergraphs whose spectra can be used to estimate various combinatorial properties of the hypergraph. In this paper we introduce a new hypergraph Laplacian operator generalizing the Laplacian matrix of graphs. Our operator can be viewed as the gradient operator applied to a certain natural quadratic form for hypergraphs. We show that various hypergraph parameters (e.g., expansion and diameter) can be bounded using this operator's eigenvalues. We study the heat diffusion process associated with this Laplacian operator, and bound its parameters in terms of its spectrum. All our results are generalizations of the corresponding results for graphs. We show that there can be no linear operator for hypergraphs whose spectrum captures hypergraph expansion in a Cheeger-like manner. Our Laplacian operator is non-linear, and thus computing its eigenvalues exactly is intractable. For any $k$, we give a polynomial-time algorithm to compute an approximation to the $k$th smallest eigenvalue of the operator. We show that this approximation factor is optimal under the SSE hypothesis (introduced by [RS10]) for constant values of $k$. Finally, using the factor-preserving reduction from vertex expansion in graphs to hypergraph expansion, we show that all our results for hypergraphs extend to vertex expansion in graphs.
{"title":"Hypergraph Markov Operators, Eigenvalues and Approximation Algorithms","authors":"Anand Louis","doi":"10.1145/2746539.2746555","DOIUrl":"https://doi.org/10.1145/2746539.2746555","journal":{"name":"Proceedings of the forty-seventh annual ACM symposium on Theory of Computing"},"publicationDate":"2014-08-11"}
Super-resolution is a fundamental task in imaging, where the goal is to extract fine-grained structure from coarse-grained measurements. Here we are interested in a popular mathematical abstraction of this problem that has been widely studied in the statistics, signal processing and machine learning communities. We exactly resolve the threshold at which noisy super-resolution is possible. In particular, we establish a sharp phase transition for the relationship between the cutoff frequency (m) and the separation (Δ). If m > 1/Δ + 1, our estimator converges to the true values at an inverse polynomial rate in terms of the magnitude of the noise. When m < (1-ε)/Δ, no estimator can distinguish between a particular pair of Δ-separated signals, even if the magnitude of the noise is exponentially small. Our results involve making novel connections between extremal functions and the spectral properties of Vandermonde matrices. We establish a sharp phase transition for their condition number, which in turn allows us to give the first noise tolerance bounds for the matrix pencil method. Moreover, we show that our methods can be interpreted as giving preconditioners for Vandermonde matrices, and we use this observation to design faster algorithms for super-resolution. We believe that these ideas may have applications in designing faster algorithms for other basic tasks in signal processing.
{"title":"Super-resolution, Extremal Functions and the Condition Number of Vandermonde Matrices","authors":"Ankur Moitra","doi":"10.1145/2746539.2746561","DOIUrl":"https://doi.org/10.1145/2746539.2746561","journal":{"name":"Proceedings of the forty-seventh annual ACM symposium on Theory of Computing"},"publicationDate":"2014-08-07"}
A 2-server Private Information Retrieval (PIR) scheme allows a user to retrieve the $i$th bit of an $n$-bit database replicated among two non-communicating servers, while not revealing any information about $i$ to either server. In this work we construct a 2-server PIR scheme with total communication cost $n^{O(\sqrt{\log\log n/\log n})}$. This improves over current 2-server protocols, which all require $\Omega(n^{1/3})$ communication. Our construction circumvents the $n^{1/3}$ barrier of Razborov and Yekhanin, which holds for the restricted model of bilinear group-based schemes (covering all previous 2-server schemes). The improvement comes from reducing the number of servers in existing protocols, based on Matching Vector Codes, from 3 or 4 servers to 2. This is achieved by viewing these protocols in an algebraic way (using polynomial interpolation) and extending them using partial derivatives.
{"title":"2-Server PIR with Sub-Polynomial Communication","authors":"Zeev Dvir, Sivakanth Gopi","doi":"10.1145/2746539.2746546","DOIUrl":"https://doi.org/10.1145/2746539.2746546","journal":{"name":"Proceedings of the forty-seventh annual ACM symposium on Theory of Computing"},"publicationDate":"2014-07-24"}
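To make the 2-server PIR definition above concrete, here is a minimal sketch of the classical XOR-based scheme with linear communication. It is not the construction of the paper (which uses Matching Vector Codes and partial derivatives to reach sub-polynomial cost); it only illustrates what "retrieve bit $i$ without revealing $i$ to either server" means. The function names `make_queries`, `server_answer` and `retrieve` are hypothetical, chosen for illustration.

```python
import secrets


def make_queries(n, i):
    """Build two n-bit selection vectors that differ only in position i.

    Each vector on its own is uniformly random, so a single server
    learns nothing about i (assuming the servers do not communicate).
    """
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = list(q1)
    q2[i] ^= 1  # flip exactly the bit we want to read
    return q1, q2


def server_answer(database, query):
    """Server side: XOR of the database bits selected by the query."""
    answer = 0
    for bit, selected in zip(database, query):
        answer ^= bit & selected
    return answer


def retrieve(database, i):
    """User side: combine both one-bit answers to recover database[i]."""
    q1, q2 = make_queries(len(database), i)
    # All positions except i are selected in both queries or in neither,
    # so they cancel under XOR and only database[i] remains.
    return server_answer(database, q1) ^ server_answer(database, q2)
```

This toy scheme sends $n$ bits to each server, which is far worse than the $\Omega(n^{1/3})$ cost of earlier protocols mentioned in the abstract; its only purpose is to show the privacy and correctness requirements in a few lines.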
We say a turnstile streaming algorithm is non-adaptive if, during updates, the memory cells written and read depend only on the index being updated and random coins tossed at the beginning of the stream (and not on the memory contents of the algorithm). Memory cells read during queries may be decided upon adaptively. All known turnstile streaming algorithms in the literature, except a single recent example for a particular promise problem [7], are non-adaptive. In fact, even more specifically, they are all linear sketches. We prove the first non-trivial update time lower bounds for both randomized and deterministic turnstile streaming algorithms, which hold when the algorithms are non-adaptive. While there has been abundant success in proving space lower bounds, there have been no non-trivial turnstile update time lower bounds. Our lower bounds hold against classically studied problems such as heavy hitters, point query, entropy estimation, and moment estimation. In some cases of deterministic algorithms, our lower bounds nearly match known upper bounds.
{"title":"Time Lower Bounds for Nonadaptive Turnstile Streaming Algorithms","authors":"Kasper Green Larsen, Jelani Nelson, Huy L. Nguyen","doi":"10.1145/2746539.2746542","DOIUrl":"https://doi.org/10.1145/2746539.2746542","journal":{"name":"Proceedings of the forty-seventh annual ACM symposium on Theory of Computing"},"publicationDate":"2014-07-08"}
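As an illustration of the non-adaptive definition above, the following sketch implements a Count-Min style linear sketch for point queries. It is offered as a familiar example of a linear sketch, not as an algorithm taken from the paper. The class name, the hash family, and the parameters `width`, `depth` and `seed` are assumptions made for this example. The point to notice is that `cells(index)` depends only on the index and on coins drawn in the constructor, never on the table contents.

```python
import random


class CountMinSketch:
    """Toy linear sketch for point queries in the strict turnstile model."""

    PRIME = 2**61 - 1  # modulus for the (assumed) hash family a*x + b mod p

    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)  # all random coins are tossed up front
        self.width = width
        self.hashes = [
            (rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
            for _ in range(depth)
        ]
        self.table = [[0] * width for _ in range(depth)]

    def cells(self, index):
        """Cells touched by an update to `index`: one per row.

        The result is a function of `index` and the initial coins only,
        which is exactly the non-adaptivity condition.
        """
        return [
            (row, ((a * index + b) % self.PRIME) % self.width)
            for row, (a, b) in enumerate(self.hashes)
        ]

    def update(self, index, delta):
        """Turnstile update: add `delta` (possibly negative) to x[index]."""
        for row, col in self.cells(index):
            self.table[row][col] += delta

    def query(self, index):
        """Point-query estimate; an upper bound on x[index] as long as
        every coordinate of x is non-negative at query time."""
        return min(self.table[row][col] for row, col in self.cells(index))
```

Each update here touches `depth` cells, so `depth` is the update time in the cell-probe sense; lower bounds of the kind described in the abstract constrain how small such a quantity can be for non-adaptive algorithms.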
The planted bisection model is a random graph model in which the nodes are divided into two equal-sized communities and then edges are added randomly in a way that depends on the community membership. We establish necessary and sufficient conditions for the asymptotic recoverability of the planted bisection in this model. When the bisection is asymptotically recoverable, we give an efficient algorithm that successfully recovers it. We also show that the planted bisection is recoverable asymptotically if and only if with high probability every node belongs to the same community as the majority of its neighbors. Our algorithm for finding the planted bisection runs in time almost linear in the number of edges. It has three stages: spectral clustering to compute an initial guess, a "replica" stage to get almost every vertex correct, and then some simple local moves to finish the job. An independent work by Abbe, Bandeira, and Hall establishes similar (slightly weaker) results but only in the sparse case where $p_n, q_n = \Theta(\log n / n)$.
{"title":"Consistency Thresholds for the Planted Bisection Model","authors":"Elchanan Mossel, Joe Neeman, A. Sly","doi":"10.1145/2746539.2746603","DOIUrl":"https://doi.org/10.1145/2746539.2746603","journal":{"name":"Proceedings of the forty-seventh annual ACM symposium on Theory of Computing"},"publicationDate":"2014-07-07"}
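The model and the majority condition from the abstract above can be sketched in a few lines. This is only an illustration of the definitions, not the three-stage recovery algorithm of the paper. The function names, the use of `p` for the within-community edge probability and `q` for the across-community one, and the strict treatment of ties are assumptions made for this example.

```python
import random


def sample_planted_bisection(n, p, q, seed=0):
    """Sample a graph on n nodes (n even) with two equal-sized communities.

    Each pair is joined independently with probability p if both nodes
    share a community and with probability q otherwise.
    Returns (labels, adjacency lists).
    """
    rng = random.Random(seed)
    labels = [0] * (n // 2) + [1] * (n // 2)
    rng.shuffle(labels)
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            prob = p if labels[u] == labels[v] else q
            if rng.random() < prob:
                adj[u].append(v)
                adj[v].append(u)
    return labels, adj


def every_node_agrees_with_majority(labels, adj):
    """Check whether each node has strictly more neighbours in its own
    community than in the other one (ties count as failure here)."""
    for u, neighbours in enumerate(adj):
        same = sum(1 for v in neighbours if labels[v] == labels[u])
        if same <= len(neighbours) - same:
            return False
    return True
```

Running the check over many samples at given `n`, `p` and `q` gives an empirical feel for the regime in which the majority condition holds with high probability, which the abstract identifies as the regime of asymptotic recoverability.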
We give a new approach to the dictionary learning (also known as "sparse coding") problem of recovering an unknown $n \times m$ matrix $A$ (for $m \ge n$) from examples of the form $y = Ax + e$, where $x$ is a random vector in $\mathbb{R}^m$ with at most $\tau m$ nonzero coordinates, and $e$ is a random noise vector in $\mathbb{R}^n$ with bounded magnitude. For the case $m = O(n)$, our algorithm recovers every column of $A$ within arbitrarily good constant accuracy in time $m^{O(\log m/\log(\tau^{-1}))}$, in particular achieving polynomial time if $\tau = m^{-\delta}$ for any $\delta > 0$, and time $m^{O(\log m)}$ if $\tau$ is (a sufficiently small) constant. Prior algorithms with comparable assumptions on the distribution required the vector $x$ to be much sparser (at most $\sqrt{n}$ nonzero coordinates), and there were intrinsic barriers preventing these algorithms from applying for denser $x$. We achieve this by designing an algorithm for noisy tensor decomposition that can recover, under quite general conditions, an approximate rank-one decomposition of a tensor $T$, given access to a tensor $T'$ that is $\tau$-close to $T$ in the spectral norm (when considered as a matrix). To our knowledge, this is the first algorithm for tensor decomposition that works in the constant spectral-norm noise regime, where there is no guarantee that the local optima of $T$ and $T'$ have similar structures. Our algorithm is based on a novel approach to using and analyzing the Sum of Squares semidefinite programming hierarchy (Parrilo 2000, Lasserre 2001), and it can be viewed as an indication of the utility of this very general and powerful tool for unsupervised learning problems.
{"title":"Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method","authors":"B. Barak, Jonathan A. Kelner, David Steurer","doi":"10.1145/2746539.2746605","DOIUrl":"https://doi.org/10.1145/2746539.2746605","journal":{"name":"Proceedings of the forty-seventh annual ACM symposium on Theory of Computing"},"publicationDate":"2014-07-06"}
We prove that finding an ε-approximate Nash equilibrium is PPAD-complete for constant ε and a particularly simple class of games: polymatrix, degree 3 graphical games, in which each player has only two actions. As corollaries, we also prove similar inapproximability results for Bayesian Nash equilibrium in a two-player incomplete information game with a constant number of actions, for relative ε-Nash equilibrium in a two-player game, for market equilibrium in a non-monotone market, for the generalized circuit problem defined by Chen et al. [4], and for approximate competitive equilibrium from equal incomes with indivisible goods.
{"title":"Inapproximability of Nash Equilibrium","authors":"A. Rubinstein","doi":"10.1145/2746539.2746578","DOIUrl":"https://doi.org/10.1145/2746539.2746578","journal":{"name":"Proceedings of the forty-seventh annual ACM symposium on Theory of Computing"},"publicationDate":"2014-05-13"}
The Graph Pricing problem is among the fundamental problems whose approximability is not well understood. While there is a simple combinatorial 1/4-approximation algorithm, the best hardness result remains at 1/2 assuming the Unique Games Conjecture (UGC). We show that it is NP-hard to approximate within a factor better than 1/4 under the UGC, so that the simple combinatorial algorithm might be the best possible. We also prove that for any ε > 0, there exists δ > 0 such that the integrality gap of $n^{\delta}$ rounds of the Sherali-Adams hierarchy of linear programming for Graph Pricing is at most 1/4 + ε. This work is based on the effort to view the Graph Pricing problem as a Constraint Satisfaction Problem (CSP) that is simpler than the standard, complicated formulation. We propose a problem called Generalized Max-Dicut(T), which has domain size T + 1 for every T ≥ 1. Generalized Max-Dicut(1) is the well-known Max-Dicut problem. There is an approximation-preserving reduction from Generalized Max-Dicut on directed acyclic graphs (DAGs) to Graph Pricing, and both our results are achieved through this reduction. Besides its connection to Graph Pricing, the hardness of Generalized Max-Dicut is interesting in its own right: in most arity-two CSPs studied in the literature, SDP-based algorithms perform better than LP-based or combinatorial algorithms, whereas for this arity-two CSP a simple combinatorial algorithm does the best.
{"title":"Hardness of Graph Pricing Through Generalized Max-Dicut","authors":"Euiwoong Lee","doi":"10.1145/2746539.2746549","DOIUrl":"https://doi.org/10.1145/2746539.2746549","journal":{"name":"Proceedings of the forty-seventh annual ACM symposium on Theory of Computing"},"publicationDate":"2014-05-04"}
We consider the problem of identifying the parameters of an unknown mixture of two arbitrary $d$-dimensional Gaussians from a sequence of independent random samples. Our main results are upper and lower bounds giving a computationally efficient moment-based estimator with an optimal convergence rate, thus resolving a problem introduced by Pearson (1894). Denoting by $\sigma^2$ the variance of the unknown mixture, we prove that $\Theta(\sigma^{12})$ samples are necessary and sufficient to estimate each parameter up to constant additive error when $d = 1$. Our upper bound extends to arbitrary dimension $d > 1$ up to a (provably necessary) logarithmic loss in $d$ using a novel, yet simple, dimensionality reduction technique. We further identify several interesting special cases where the sample complexity is notably smaller than our optimal worst-case bound. For instance, if the means of the two components are separated by $\Omega(\sigma)$, the sample complexity reduces to $O(\sigma^2)$, and this is again optimal. Our results also apply to learning each component of the mixture up to small error in total variation distance, where our algorithm gives strong improvements in sample complexity over previous work.
{"title":"Tight Bounds for Learning a Mixture of Two Gaussians","authors":"Moritz Hardt, Eric Price","doi":"10.1145/2746539.2746579","DOIUrl":"https://doi.org/10.1145/2746539.2746579","journal":{"name":"Proceedings of the forty-seventh annual ACM symposium on Theory of Computing"},"publicationDate":"2014-04-19"}