On Learning Mixtures of Well-Separated Gaussians

O. Regev, Aravindan Vijayaraghavan
{"title":"关于分离良好高斯分布的学习混合","authors":"O. Regev, Aravindan Vijayaraghavan","doi":"10.1109/FOCS.2017.17","DOIUrl":null,"url":null,"abstract":"We consider the problem of efficiently learning mixtures of a large number of spherical Gaussians, when the components of the mixture are well separated. In the most basic form of this problem, we are given samples from a uniform mixture of k standard spherical Gaussians with means mu_1,...,mu_k in R^d, and the goal is to estimate the means up to accuracy δ using poly(k,d, 1/δ) samples.In this work, we study the following question: what is the minimum separation needed between the means for solving this task? The best known algorithm due to Vempala and Wang [JCSS 2004] requires a separation of roughly min{k,d}^{1/4}. On the other hand, Moitra and Valiant [FOCS 2010] showed that with separation o(1), exponentially many samples are required. We address the significant gap between these two bounds, by showing the following results.• We show that with separation o(√log k), super-polynomially many samples are required. In fact, this holds even when the k means of the Gaussians are picked at random in d=O(log k) dimensions.• We show that with separation Ω(√log k), poly(k,d,1/δ) samples suffice. Notice that the bound on the separation is independent of δ. This result is based on a new and efficient accuracy boosting algorithm that takes as input coarse estimates of the true means and in time (and samples) poly(k,d, 1δ) outputs estimates of the means up to arbitrarily good accuracy δ assuming the separation between the means is Ωmin √(log k),√d) (independently of δ). The idea of the algorithm is to iteratively solve a diagonally dominant system of non-linear equations.We also (1) present a computationally efficient algorithm in d=O(1) dimensions with only Ω(√{d}) separation, and (2) extend our results to the case that components might have different weights and variances. These results together essentially characterize the optimal order of separation between components that is needed to learn a mixture of k spherical Gaussians with polynomial samples.","PeriodicalId":311592,"journal":{"name":"2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"67","resultStr":"{\"title\":\"On Learning Mixtures of Well-Separated Gaussians\",\"authors\":\"O. Regev, Aravindan Vijayaraghavan\",\"doi\":\"10.1109/FOCS.2017.17\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We consider the problem of efficiently learning mixtures of a large number of spherical Gaussians, when the components of the mixture are well separated. In the most basic form of this problem, we are given samples from a uniform mixture of k standard spherical Gaussians with means mu_1,...,mu_k in R^d, and the goal is to estimate the means up to accuracy δ using poly(k,d, 1/δ) samples.In this work, we study the following question: what is the minimum separation needed between the means for solving this task? The best known algorithm due to Vempala and Wang [JCSS 2004] requires a separation of roughly min{k,d}^{1/4}. On the other hand, Moitra and Valiant [FOCS 2010] showed that with separation o(1), exponentially many samples are required. 
We address the significant gap between these two bounds, by showing the following results.• We show that with separation o(√log k), super-polynomially many samples are required. In fact, this holds even when the k means of the Gaussians are picked at random in d=O(log k) dimensions.• We show that with separation Ω(√log k), poly(k,d,1/δ) samples suffice. Notice that the bound on the separation is independent of δ. This result is based on a new and efficient accuracy boosting algorithm that takes as input coarse estimates of the true means and in time (and samples) poly(k,d, 1δ) outputs estimates of the means up to arbitrarily good accuracy δ assuming the separation between the means is Ωmin √(log k),√d) (independently of δ). The idea of the algorithm is to iteratively solve a diagonally dominant system of non-linear equations.We also (1) present a computationally efficient algorithm in d=O(1) dimensions with only Ω(√{d}) separation, and (2) extend our results to the case that components might have different weights and variances. These results together essentially characterize the optimal order of separation between components that is needed to learn a mixture of k spherical Gaussians with polynomial samples.\",\"PeriodicalId\":311592,\"journal\":{\"name\":\"2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)\",\"volume\":\"4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-10-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"67\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FOCS.2017.17\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FOCS.2017.17","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 67

Abstract

We consider the problem of efficiently learning mixtures of a large number of spherical Gaussians when the components of the mixture are well separated. In the most basic form of this problem, we are given samples from a uniform mixture of k standard spherical Gaussians with means μ_1, ..., μ_k in R^d, and the goal is to estimate the means up to accuracy δ using poly(k, d, 1/δ) samples.

In this work, we study the following question: what is the minimum separation needed between the means for solving this task? The best known algorithm, due to Vempala and Wang [JCSS 2004], requires a separation of roughly min{k, d}^{1/4}. On the other hand, Moitra and Valiant [FOCS 2010] showed that with separation o(1), exponentially many samples are required. We address the significant gap between these two bounds by showing the following results.

• We show that with separation o(√log k), super-polynomially many samples are required. In fact, this holds even when the k means of the Gaussians are picked at random in d = O(log k) dimensions.

• We show that with separation Ω(√log k), poly(k, d, 1/δ) samples suffice. Notice that the bound on the separation is independent of δ. This result is based on a new and efficient accuracy-boosting algorithm that takes as input coarse estimates of the true means and, in time (and samples) poly(k, d, 1/δ), outputs estimates of the means up to arbitrarily good accuracy δ, assuming the separation between the means is Ω(min(√log k, √d)) (independently of δ). The idea of the algorithm is to iteratively solve a diagonally dominant system of non-linear equations.

We also (1) present a computationally efficient algorithm in d = O(1) dimensions with only Ω(√d) separation, and (2) extend our results to the case in which components may have different weights and variances. Together, these results essentially characterize the optimal order of separation between components that is needed to learn a mixture of k spherical Gaussians with polynomially many samples.
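To make the setup concrete, the following is a minimal Python sketch of the learning problem: sampling from a uniform mixture of k standard spherical Gaussians with separated means, and then boosting coarse mean estimates by iterative refinement. The refinement step shown is a generic soft-assignment (EM-style) fixed-point update chosen only to convey the "iteratively improve coarse estimates" idea; it is not the paper's algorithm, which instead solves a diagonally dominant system of non-linear equations. All constants (k, d, the separation multiplier, sample size, noise scale) are arbitrary toy choices.

```python
# Illustrative sketch only; NOT the algorithm of Regev-Vijayaraghavan.
import numpy as np

rng = np.random.default_rng(0)

k, d = 8, 20
sep = 2.0 * np.sqrt(np.log(k))  # separation on the order of sqrt(log k)

# Place k means with pairwise distance at least `sep` (rejection sampling).
means = []
while len(means) < k:
    cand = rng.normal(scale=3 * sep, size=d)
    if all(np.linalg.norm(cand - m) >= sep for m in means):
        means.append(cand)
means = np.array(means)

# Draw n samples from the uniform mixture of standard spherical Gaussians.
n = 20000
labels = rng.integers(k, size=n)
X = means[labels] + rng.normal(size=(n, d))

# Start from coarse estimates (true means plus noise) and refine them with a
# soft-assignment fixed-point iteration (uniform weights, unit variances).
est = means + rng.normal(scale=0.3, size=(k, d))
for _ in range(25):
    sq = ((X[:, None, :] - est[None, :, :]) ** 2).sum(axis=2)  # (n, k)
    logw = -0.5 * sq
    logw -= logw.max(axis=1, keepdims=True)  # stabilize before exponentiating
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)        # responsibilities
    est = (w.T @ X) / w.sum(axis=0)[:, None] # weighted-mean update

err = max(np.linalg.norm(est[i] - means[i]) for i in range(k))
print(f"max mean-estimation error after refinement: {err:.3f}")
```

Because each estimate starts close to its true mean and the components are well separated, the soft assignments are nearly indicator vectors and each update contracts the error toward the sampling noise floor, mirroring (in spirit) why boosting coarse estimates to arbitrary accuracy δ is possible once the separation threshold is met.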