Mixture models, robustness, and sum of squares proofs

Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing. Pub Date: 2017-11-20. DOI: 10.1145/3188745.3188748

Samuel B. Hopkins, Jerry Li


Citations: 158

Abstract

We use the Sum of Squares method to develop new efficient algorithms for learning well-separated mixtures of Gaussians and for robust mean estimation, both in high dimensions, that substantially improve upon the statistical guarantees achieved by previous efficient algorithms. Our contributions are:

Mixture models with separated means: We study mixtures of poly(k)-many k-dimensional distributions where the means of every pair of distributions are separated by at least k^ε. In the special case of spherical Gaussian mixtures, we give a k^{O(1/ε)}-time algorithm that learns the means assuming separation at least k^ε, for any ε > 0. This is the first algorithm to improve on greedy ("single-linkage") and spectral clustering, breaking a long-standing barrier for efficient algorithms at separation k^{1/4}.

Robust estimation: When an unknown (1−ε)-fraction of X_1, …, X_n are chosen from a sub-Gaussian distribution with mean µ but the remaining points are chosen adversarially, we give an algorithm recovering µ to error ε^{1−1/t} in time k^{O(t)}, so long as sub-Gaussian-ness up to O(t) moments can be certified by a Sum of Squares proof. This is the first polynomial-time algorithm with guarantees approaching the information-theoretic limit for non-Gaussian distributions. Previous algorithms could not achieve error better than ε^{1/2}. As a corollary, we achieve similar results for robust covariance estimation.

Both of these results are based on a unified technique. Inspired by recent algorithms of Diakonikolas et al. in robust statistics, we devise an SDP based on the Sum of Squares method for the following setting: given X_1, …, X_n ∈ ℝ^k for large k and n = poly(k), with the promise that a subset of X_1, …, X_n were sampled from a probability distribution with bounded moments, recover some information about that distribution.
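To make the robust estimation setting concrete, the following is a minimal sketch of the contamination model only. It is not the paper's Sum of Squares algorithm. The function names, the choice of a standard Gaussian for the inliers, and the simple "shift every bad point in one direction" adversary are all assumptions made for illustration. It shows why the plain empirical mean is not robust: its error scales roughly like ε times the distance the adversary moves the points, which the adversary can make arbitrarily large.

```python
import numpy as np


def contaminated_sample(n, k, eps, mu, shift, rng):
    """Draw n points in R^k: about a (1 - eps)-fraction from N(mu, I),
    the rest overwritten by a simple (hypothetical) adversary."""
    n_bad = int(round(eps * n))
    X = rng.standard_normal((n, k)) + mu
    # Assumed adversary: place every corrupted point at distance `shift`
    # from mu along one fixed unit direction.
    direction = np.ones(k) / np.sqrt(k)
    X[:n_bad] = mu + shift * direction
    return X


def empirical_mean_error(X, mu):
    """Euclidean distance between the plain sample mean and the true mean."""
    return float(np.linalg.norm(X.mean(axis=0) - mu))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k, n, eps = 50, 20000, 0.1
    mu = np.zeros(k)
    clean = contaminated_sample(n, k, 0.0, mu, 10.0, rng)
    dirty = contaminated_sample(n, k, eps, mu, 10.0, rng)
    print("clean error:", empirical_mean_error(clean, mu))
    print("contaminated error:", empirical_mean_error(dirty, mu))
```

In this toy setup the contaminated error is dominated by the term ε · shift, so doubling `shift` roughly doubles the error. A robust estimator in the sense of the abstract is one whose error depends only on ε (for example ε^{1−1/t}), not on how far the adversary moves the corrupted points.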
