Proceedings of the forty-seventh annual ACM symposium on Theory of Computing: Latest Articles

From Independence to Expansion and Back Again

Pub Date : 2015-06-11 DOI: 10.1145/2746539.2746620

Tobias Christiani, R. Pagh, M. Thorup

We consider the following fundamental problems: (1) constructing k-independent hash functions with a space-time tradeoff close to Siegel's lower bound; (2) constructing representations of unbalanced expander graphs having small size and allowing fast computation of the neighbor function. It is not hard to show that these problems are intimately connected in the sense that a good solution to one of them leads to a good solution to the other one. In this paper we exploit this connection to present efficient, recursive constructions of k-independent hash functions (and hence expanders with a small representation). While the previously most efficient construction (Thorup, FOCS 2013) needed time quasipolynomial in Siegel's lower bound, our time bound is just a logarithmic factor from the lower bound.
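
For readers unfamiliar with k-independence, the sketch below shows the textbook baseline: a uniformly random polynomial of degree k-1 over a prime field. It is not the recursive construction of the paper; it uses O(k) space and O(k) time per evaluation, which is the kind of tradeoff the abstract aims to improve. The names `P` and `make_k_independent_hash` are illustrative assumptions.

```python
import random

# Assumed field size: the Mersenne prime 2^61 - 1. Keys must lie in [0, P).
P = (1 << 61) - 1


def make_k_independent_hash(k, rng=None):
    """Return h(x) = sum_i coeffs[i] * x^i mod P with k random coefficients.

    For distinct keys in [0, P), any k hash values are independent and
    uniform over [0, P). This is the classic polynomial family, used here
    only as a baseline illustration.
    """
    rng = rng or random.Random()
    coeffs = [rng.randrange(P) for _ in range(k)]  # coeffs[0] is the constant term

    def h(x):
        acc = 0
        # Horner's rule, starting from the highest-degree coefficient.
        for c in reversed(coeffs):
            acc = (acc * x + c) % P
        return acc

    return h
```

Reducing the output modulo a table size m gives values that are only approximately uniform, since P is not a multiple of m.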

Citations: 17

Polynomially Low Error PCPs with polyloglog n Queries via Modular Composition

Pub Date : 2015-05-23 DOI: 10.1145/2746539.2746630

Irit Dinur, P. Harsha, Guy Kindler

We show that every language in NP has a PCP verifier that tosses O(log n) random coins, has perfect completeness, and a soundness error of at most 1/poly(n), while making O(poly log log n) queries into a proof over an alphabet of size at most n^{1/poly log log n}. Previous constructions that obtain 1/poly(n) soundness error used either poly log n queries or an exponential alphabet, i.e. of size 2^{n^c} for some c > 0. Our result is an exponential improvement in both parameters simultaneously. Our result can be phrased as polynomial-gap hardness for approximate CSPs with arity poly log log n and alphabet size n^{1/poly log n}. The ultimate goal, in this direction, would be to prove polynomial hardness for CSPs with constant arity and polynomial alphabet size (aka the sliding scale conjecture for inverse polynomial soundness error). Our construction is based on a modular generalization of previous PCP constructions in this parameter regime, which involves a composition theorem that uses an extra 'consistency' query but maintains the inverse polynomial relation between the soundness error and the alphabet size. Our main technical/conceptual contribution is a new notion of soundness, which we refer to as distributional soundness, that replaces the previous notion of "list decoding soundness", and allows us to invoke composition a super-constant number of times without incurring a blow-up in the soundness error.

Citations: 15

A Polynomial-time Bicriteria Approximation Scheme for Planar Bisection

Pub Date : 2015-04-29 DOI: 10.1145/2746539.2746564

K. Fox, P. Klein, S. Mozes

Given an undirected graph with edge costs and node weights, the minimum bisection problem asks for a partition of the nodes into two parts of equal weight such that the sum of edge costs between the parts is minimized. We give a polynomial-time bicriteria approximation scheme for bisection on planar graphs. Specifically, let W be the total weight of all nodes in a planar graph G. For any constant ε > 0, our algorithm outputs a bipartition of the nodes such that each part weighs at most W/2 + ε and the total cost of edges crossing the partition is at most (1+ε) times the total cost of the optimal bisection. The previously best known approximation for planar minimum bisection, even with unit node weights, was Õ(log n). Our algorithm actually solves a more general problem where the input may include a target weight for the smaller side of the bipartition.
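
The two criteria that the scheme trades off can be made concrete with a small helper that scores a given bipartition. This is only an evaluation sketch, not the approximation algorithm; the function name and the input encoding (edge triples and a node-to-weight dictionary) are assumptions made for illustration.

```python
def evaluate_bipartition(edges, weight, side_a):
    """Return (cut_cost, heavier_side_weight) for a bipartition.

    edges:  iterable of (u, v, cost) triples of an undirected graph
    weight: dict mapping each node to its weight
    side_a: nodes placed on one side; all other nodes form the other side
    """
    side_a = set(side_a)
    # An edge is cut exactly when its endpoints lie on different sides.
    cut_cost = sum(c for u, v, c in edges if (u in side_a) != (v in side_a))
    w_a = sum(w for node, w in weight.items() if node in side_a)
    w_b = sum(weight.values()) - w_a
    return cut_cost, max(w_a, w_b)
```

On the path a-b-c-d with edge costs 1, 5, 1 and unit node weights, the balanced split {a, b} pays cut cost 5, while the unbalanced split {a} pays only 1; a bicriteria guarantee bounds both quantities at once.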

Citations: 7

On the Lovász Theta function for Independent Sets in Sparse Graphs

Pub Date : 2015-04-18 DOI: 10.1145/2746539.2746607

N. Bansal, Anupam Gupta, Guru Guruganesh

We consider the maximum independent set problem on graphs with maximum degree d. We show that the Lovász Theta function-based SDP has an integrality gap of Õ(d/log^{3/2} d). This improves on the previous best result of Õ(d/log d), and narrows the gap of this basic SDP to the integrality gap of Õ(d/log^2 d) recently shown for stronger SDPs, namely those obtained using poly log(d) levels of the SA+ semidefinite hierarchy. The improvement comes from an improved Ramsey-theoretic bound on the independence number of K_r-free graphs for large values of r. We also show how to obtain an algorithmic version of the above-mentioned SA+-based integrality gap result, via a coloring algorithm of Johansson. The resulting approximation guarantee of Õ(d/log^2 d) matches the best unique-games-based hardness result up to lower-order poly(log log d) factors.

Citations: 28

Local, Private, Efficient Protocols for Succinct Histograms

Pub Date : 2015-04-17 DOI: 10.1145/2746539.2746632

Raef Bassily, Adam D. Smith

We give efficient protocols and matching accuracy lower bounds for frequency estimation in the local model for differential privacy. In this model, individual users randomize their data themselves, sending differentially private reports to an untrusted server that aggregates them. We study protocols that produce a succinct histogram representation of the data. A succinct histogram is a list of the most frequent items in the data (often called "heavy hitters") along with estimates of their frequencies; the frequency of all other items is implicitly estimated as 0. If there are n users whose items come from a universe of size d, our protocols run in time polynomial in n and log(d). With high probability, they estimate the frequency of every item up to error O(√(log(d)/(ε^2 n))). Moreover, we show that this much error is necessary, regardless of computational efficiency, and even for the simple setting where only one item appears with significant frequency in the data set. Previous protocols (Mishra and Sandler, 2006; Hsu, Khanna and Roth, 2012) for this task either ran in time Ω(d) or had much worse error (about (log(d)/(ε^2 n))^{1/6}), and the only known lower bound on error was Ω(1/√n). We also adapt a result of McGregor et al. (2010) to the local setting. In a model with public coins, we show that each user need only send 1 bit to the server. For all known local protocols (including ours), the transformation preserves computational efficiency.
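
To illustrate the local model, here is a minimal sketch of classic binary randomized response for estimating the frequency of one fixed item. It is the standard textbook mechanism, not the succinct-histogram protocol of the paper, and the function names are assumptions. Each user holds a bit (1 if their item equals the target) and reports it truthfully with probability e^ε/(1+e^ε), which satisfies ε-local differential privacy; the server then debiases the average of the reports.

```python
import math
import random


def randomize_bit(bit, eps, rng):
    """User side: keep the bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(eps) / (1.0 + math.exp(eps))
    return bit if rng.random() < p_truth else 1 - bit


def estimate_frequency(reports, eps):
    """Server side: unbiased estimate of the fraction of users whose true bit is 1.

    If f is the true fraction, E[report] = (1 - p) + f * (2p - 1),
    so we invert that affine map on the observed mean.
    """
    p = math.exp(eps) / (1.0 + math.exp(eps))
    mean = sum(reports) / len(reports)
    return (mean - (1.0 - p)) / (2.0 * p - 1.0)
```

For small ε the standard deviation of this estimate scales roughly like 1/(ε√n), which matches the 1/√n dependence in the error bounds quoted above for a single item.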

Citations: 373

Testing Cluster Structure of Graphs

Pub Date : 2015-04-13 DOI: 10.1145/2746539.2746618

A. Czumaj, Pan Peng, C. Sohler

We study the problem of recognizing the cluster structure of a graph in the framework of property testing in the bounded degree model. Given a parameter ε, a d-bounded degree graph is defined to be (k, φ)-clusterable if it can be partitioned into no more than k parts, such that the (inner) conductance of the induced subgraph on each part is at least φ and the (outer) conductance of each part is at most c_{d,k} ε^4 φ^2, where c_{d,k} depends only on d and k. Our main result is a sublinear algorithm with running time Õ(√n · poly(φ, k, 1/ε)) that takes as input a graph with maximum degree bounded by d and parameters k, φ, ε, and with probability at least 2/3 accepts the graph if it is (k, φ)-clusterable and rejects the graph if it is ε-far from (k, φ*)-clusterable for φ* = c'_{d,k} φ^2 ε^4 / log n, where c'_{d,k} depends only on d and k. By the lower bound of Ω(√n) on the number of queries needed for testing graph expansion, which corresponds to k = 1 in our problem, our algorithm is asymptotically optimal up to polylogarithmic factors.
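
The (outer) conductance of a part can be computed directly on small graphs. The sketch below uses the common volume-based definition, cut(S, V∖S) / min(vol(S), vol(V∖S)); this normalization is an assumption, and the bounded-degree model in the paper may normalize by d·|S| instead. The function name and adjacency-dictionary encoding are illustrative.

```python
def conductance(adj, part):
    """Outer conductance of `part` in an undirected graph.

    adj:  dict mapping each node to a list of its neighbours
    part: the node set S whose conductance is measured
    """
    part = set(part)
    # Number of edges with exactly one endpoint in S.
    cut = sum(1 for u in part for v in adj[u] if v not in part)
    vol_in = sum(len(adj[u]) for u in part)
    vol_out = sum(len(adj[u]) for u in adj if u not in part)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0
```

For two triangles joined by a single edge, each triangle has conductance 1/7, so the graph splits into two well-separated clusters in this sense.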

Citations: 29

Learning Arbitrary Statistical Mixtures of Discrete Distributions

Pub Date : 2015-04-09 DOI: 10.1145/2746539.2746584

Jian Li, Y. Rabani, L. Schulman, Chaitanya Swamy

We study the problem of learning very general statistical mixture models on large finite sets from unlabeled samples. Specifically, the model to be learned, mix, is a probability distribution over probability distributions p, where each such p is a probability distribution over [n] = {1, 2, ..., n}. When we sample from mix, we do not observe p directly, but only indirectly and in a very noisy fashion, by drawing K independent samples from [n] according to the distribution p. The problem is to infer mix to high accuracy in transportation (earthmover) distance. We give the first efficient algorithms for learning this mixture model without making any restrictive assumptions on the structure of the distribution mix. We bound the quality of the solution as a function of the sample size K and the number of samples used. Our model and results have applications to a variety of unsupervised learning scenarios, including learning topic models and collaborative filtering.
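
The sampling process described above can be sketched as follows. The sketch assumes, purely for illustration, that mix has finite support and is given as a list of (weight, p) pairs, where p is a length-n list of probabilities over [n]; the paper itself places no such restriction on mix. The function name is hypothetical.

```python
import random


def sample_k_snapshot(mix, K, rng):
    """Draw one observation: pick a hidden p from mix, then K i.i.d. items from p.

    mix: list of (weight, p) pairs; p[i] is the probability of item i + 1
    Returns a list of K items from {1, ..., n}. The chosen p is never revealed.
    """
    weights = [w for w, _ in mix]
    _, p = rng.choices(mix, weights=weights, k=1)[0]
    n = len(p)
    return rng.choices(range(1, n + 1), weights=p, k=K)
```

A learner only sees many such K-item lists, for example `sample_k_snapshot(mix, 3, random.Random())`, and must reconstruct mix from them.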

Citations: 19

Byzantine Agreement with Optimal Early Stopping, Optimal Resilience and Polynomial Complexity

Pub Date : 2015-04-09 DOI: 10.1145/2746539.2746581

Ittai Abraham, D. Dolev

We provide the first protocol that solves Byzantine agreement with optimal early stopping (min{f+2,t+1} rounds) and optimal resilience (n>3t) using polynomial message size and computation. All previous approaches obtained sub-optimal results and used resolve rules that looked only at the immediate children in the EIG (Exponential Information Gathering) tree. At the heart of our solution are new resolve rules that look at multiple layers of the EIG tree.

Citations: 22

Space- and Time-Efficient Algorithm for Maintaining Dense Subgraphs on One-Pass Dynamic Streams

Pub Date : 2015-04-09 DOI: 10.1145/2746539.2746592

Sayan Bhattacharya, M. Henzinger, Danupon Nanongkai, Charalampos E. Tsourakakis

While in many graph mining applications it is crucial to handle a stream of updates efficiently in terms of both time and space, not much was known about achieving such algorithms. In this paper we study this issue for a problem that lies at the core of many graph mining applications, the densest subgraph problem. We develop an algorithm that achieves time- and space-efficiency for this problem simultaneously. To the best of our knowledge, it is one of the first of its kind for graph problems. Given an input graph, the densest subgraph is the subgraph that maximizes the ratio between the number of edges and the number of nodes. For any ε > 0, our algorithm can, with high probability, maintain a (4+ε)-approximate solution under edge insertions and deletions using Õ(n) space and Õ(1) amortized time per update; here, n is the number of nodes in the graph and Õ hides the O(polylog_{1+ε} n) term. The approximation ratio can be improved to (2+ε) with more time. It can be extended to a (2+ε)-approximation sublinear-time algorithm and a distributed-streaming algorithm. Our algorithm is the first streaming algorithm that can maintain the densest subgraph in one pass. Prior to this, no algorithm could do so even in the special case of an incremental stream and even when there is no time restriction. The previously best algorithm in this setting required O(log n) passes [BahmaniKV12]. The space required by our algorithm is tight up to a polylogarithmic factor.
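
As a point of reference for the density objective (edges divided by nodes), the sketch below implements the well-known static greedy peeling heuristic, which repeatedly removes a minimum-degree node and is known to give a 2-approximation. It is an offline baseline, not the one-pass dynamic streaming algorithm of the paper, and it assumes a simple undirected graph without self-loops. The function name is illustrative.

```python
def greedy_densest_subgraph(edges):
    """Greedy peeling: return (node_set, density) of the best prefix found.

    edges: iterable of (u, v) pairs of a simple undirected graph
    density of a node set = (edges inside it) / (number of nodes)
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    m = sum(len(nb) for nb in adj.values()) // 2  # current edge count
    best_density, best_nodes = 0.0, set()
    while adj:
        density = m / len(adj)
        if density > best_density:
            best_density, best_nodes = density, set(adj)
        # Peel a node of minimum remaining degree.
        u = min(adj, key=lambda x: len(adj[x]))
        for v in adj.pop(u):
            adj[v].discard(u)
            m -= 1
    return best_nodes, best_density
```

This version rescans for the minimum degree at every step, so it runs in quadratic time; a bucket queue would make it linear, at the cost of a longer listing.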

Citations: 127

Learning Mixtures of Gaussians in High Dimensions

Pub Date : 2015-03-01 DOI: 10.1145/2746539.2746616

Rong Ge, Qingqing Huang, S. Kakade

Efficiently learning mixtures of Gaussians is a fundamental problem in statistics and learning theory. Given samples coming from a random one out of k Gaussian distributions in R^n, the learning problem asks to estimate the means and the covariance matrices of these Gaussians. This learning problem arises in many areas ranging from the natural sciences to the social sciences, and has also found many machine learning applications. Unfortunately, learning mixtures of Gaussians is an information-theoretically hard problem: in order to learn the parameters up to a reasonable accuracy, the number of samples required is exponential in the number of Gaussian components in the worst case. In this work, we show that provided we are in high enough dimensions, the class of Gaussian mixtures is learnable in its most general form under a smoothed analysis framework, where the parameters are randomly perturbed from an adversarial starting point. In particular, given samples from a mixture of Gaussians with randomly perturbed parameters, when n ≥ Ω(k^2), we give an algorithm that learns the parameters with polynomial running time and using a polynomial number of samples. The central algorithmic ideas consist of new ways to decompose the moment tensor of the Gaussian mixture by exploiting its structural properties. The symmetries of this tensor are derived from the combinatorial structure of higher order moments of Gaussian distributions (sometimes referred to as Isserlis' theorem or Wick's theorem). We also develop new tools for bounding smallest singular values of structured random matrices, which could be useful in other smoothed analysis settings.
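
For orientation, the fourth-order case of Isserlis' theorem mentioned above reads as follows for a zero-mean jointly Gaussian vector with covariance entries Σ_ij = E[x_i x_j]. This is the standard statement of the theorem, given here as background rather than as a formula taken from the paper.

```latex
\mathbb{E}[x_1 x_2 x_3 x_4]
  = \Sigma_{12}\Sigma_{34} + \Sigma_{13}\Sigma_{24} + \Sigma_{14}\Sigma_{23},
\qquad\text{and in particular}\qquad
\mathbb{E}[x^4] = 3\sigma^4 \ \text{ for } x \sim \mathcal{N}(0, \sigma^2).
```

The sum ranges over the three ways of pairing up the four indices, which is the combinatorial structure that gives the moment tensor its symmetries.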

Citations: 117
