We explicitly construct an extractor for two independent sources on n bits, each with polylogarithmic min-entropy. Our extractor outputs one bit and has polynomially small error. The best previous extractor, by Bourgain, required each source to have min-entropy .499n. A key ingredient in our construction is an explicit construction of a monotone, almost-balanced Boolean functions that are resilient to coalitions. In fact, our construction is stronger in that it gives an explicit extractor for a generalization of non-oblivious bit-fixing sources on n bits, where some unknown n-q bits are chosen almost polylogarithmic-wise independently, and the remaining q bits are chosen by an adversary as an arbitrary function of the n-q bits. The best previous construction, by Viola, achieved q quadratically smaller than our result. Our explicit two-source extractor directly implies improved constructions of a K-Ramsey graph over N vertices, improving bounds obtained by Barak et al. and matching independent work by Cohen.
{"title":"Explicit two-source extractors and resilient functions","authors":"Eshan Chattopadhyay, David Zuckerman","doi":"10.1145/2897518.2897528","DOIUrl":"https://doi.org/10.1145/2897518.2897528","url":null,"abstract":"We explicitly construct an extractor for two independent sources on n bits, each with polylogarithmic min-entropy. Our extractor outputs one bit and has polynomially small error. The best previous extractor, by Bourgain, required each source to have min-entropy .499n. A key ingredient in our construction is an explicit construction of a monotone, almost-balanced Boolean functions that are resilient to coalitions. In fact, our construction is stronger in that it gives an explicit extractor for a generalization of non-oblivious bit-fixing sources on n bits, where some unknown n-q bits are chosen almost polylogarithmic-wise independently, and the remaining q bits are chosen by an adversary as an arbitrary function of the n-q bits. The best previous construction, by Viola, achieved q quadratically smaller than our result. Our explicit two-source extractor directly implies improved constructions of a K-Ramsey graph over N vertices, improving bounds obtained by Barak et al. and matching independent work by Cohen.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115773476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vincent Cohen-Addad, Éric Colin de Verdière, P. Klein, Claire Mathieu, David Meierfrankenfeld
We present a framework for addressing several problems on weighted planar graphs and graphs of bounded genus. With that framework, we derive polynomial-time approximation schemes for the following problems in planar graphs or graphs of bounded genus: edge-weighted tree cover and tour cover; vertex-weighted connected dominating set, max-weight-leaf spanning tree, and connected vertex cover. In addition, we obtain a polynomial-time approximation scheme for feedback vertex set in planar graphs. These are the first polynomial-time approximation schemes for all those problems in weighted embedded graphs. (For unweighted versions of some of these problems, polynomial-time approximation schemes were previously given using bidimensionality.)
{"title":"Approximating connectivity domination in weighted bounded-genus graphs","authors":"Vincent Cohen-Addad, Éric Colin de Verdière, P. Klein, Claire Mathieu, David Meierfrankenfeld","doi":"10.1145/2897518.2897635","DOIUrl":"https://doi.org/10.1145/2897518.2897635","url":null,"abstract":"We present a framework for addressing several problems on weighted planar graphs and graphs of bounded genus. With that framework, we derive polynomial-time approximation schemes for the following problems in planar graphs or graphs of bounded genus: edge-weighted tree cover and tour cover; vertex-weighted connected dominating set, max-weight-leaf spanning tree, and connected vertex cover. In addition, we obtain a polynomial-time approximation scheme for feedback vertex set in planar graphs. These are the first polynomial-time approximation schemes for all those problems in weighted embedded graphs. (For unweighted versions of some of these problems, polynomial-time approximation schemes were previously given using bidimensionality.)","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116280483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The classical low rank approximation problem is: given a matrix A, find a rank-k matrix B such that the Frobenius norm of A − B is minimized. It can be solved efficiently using, for instance, the Singular Value Decomposition (SVD). If one allows randomization and approximation, it can be solved in time proportional to the number of non-zero entries of A with high probability. Inspired by practical applications, we consider a weighted version of low rank approximation: for a non-negative weight matrix W we seek to minimize ∑i, j (Wi, j · (Ai,j − Bi,j))2. The classical problem is a special case of this problem when all weights are 1. Weighted low rank approximation is known to be NP-hard, so we are interested in a meaningful parametrization that would allow efficient algorithms. In this paper we present several efficient algorithms for the case of small k and under the assumption that the weight matrix W is of low rank, or has a small number of distinct columns. An important feature of our algorithms is that they do not assume anything about the matrix A. We also obtain lower bounds that show that our algorithms are nearly optimal in these parameters. We give several applications in which these parameters are small. To the best of our knowledge, the present paper is the first to provide algorithms for the weighted low rank approximation problem with provable guarantees. Perhaps even more importantly, our algorithms proceed via a new technique, which we call “guess the sketch”. The technique turns out to be general enough to give solutions to several other fundamental problems: adversarial matrix completion, weighted non-negative matrix factorization and tensor completion.
{"title":"Weighted low rank approximations with provable guarantees","authors":"Ilya P. Razenshteyn, Zhao Song, David P. Woodruff","doi":"10.1145/2897518.2897639","DOIUrl":"https://doi.org/10.1145/2897518.2897639","url":null,"abstract":"The classical low rank approximation problem is: given a matrix A, find a rank-k matrix B such that the Frobenius norm of A − B is minimized. It can be solved efficiently using, for instance, the Singular Value Decomposition (SVD). If one allows randomization and approximation, it can be solved in time proportional to the number of non-zero entries of A with high probability. Inspired by practical applications, we consider a weighted version of low rank approximation: for a non-negative weight matrix W we seek to minimize ∑i, j (Wi, j · (Ai,j − Bi,j))2. The classical problem is a special case of this problem when all weights are 1. Weighted low rank approximation is known to be NP-hard, so we are interested in a meaningful parametrization that would allow efficient algorithms. In this paper we present several efficient algorithms for the case of small k and under the assumption that the weight matrix W is of low rank, or has a small number of distinct columns. An important feature of our algorithms is that they do not assume anything about the matrix A. We also obtain lower bounds that show that our algorithms are nearly optimal in these parameters. We give several applications in which these parameters are small. To the best of our knowledge, the present paper is the first to provide algorithms for the weighted low rank approximation problem with provable guarantees. Perhaps even more importantly, our algorithms proceed via a new technique, which we call “guess the sketch”. The technique turns out to be general enough to give solutions to several other fundamental problems: adversarial matrix completion, weighted non-negative matrix factorization and tensor completion.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125311977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A spanner is a sparse subgraph that approximately preserves the pairwise distances of the original graph. It is well known that there is a smooth tradeoff between the sparsity of a spanner and the quality of its approximation, so long as distance error is measured multiplicatively. A central open question in the field is to prove or disprove whether such a tradeoff exists also in the regime of additive error. That is, is it true that for all ε>0, there is a constant kε such that every graph has a spanner on O(n1+ε) edges that preserves its pairwise distances up to +kε? Previous lower bounds are consistent with a positive resolution to this question, while previous upper bounds exhibit the beginning of a tradeoff curve: all graphs have +2 spanners on O(n3/2) edges, +4 spanners on Õ(n7/5) edges, and +6 spanners on O(n4/3) edges. However, progress has mysteriously halted at the n4/3 bound, and despite significant effort from the community, the question has remained open for all 0 < ε < 1/3. Our main result is a surprising negative resolution of the open question, even in a highly generalized setting. We show a new information theoretic incompressibility bound: there is no function that compresses graphs into O(n4/3 − ε) bits so that distance information can be recovered within +no(1) error. As a special case of our theorem, we get a tight lower bound on the sparsity of additive spanners: the +6 spanner on O(n4/3) edges cannot be improved in the exponent, even if any subpolynomial amount of additive error is allowed. Our theorem implies new lower bounds for related objects as well; for example, the twenty-year-old +4 emulator on O(n4/3) edges also cannot be improved in the exponent unless the error allowance is polynomial. Central to our construction is a new type of graph product, which we call the Obstacle Product. Intuitively, it takes two graphs G,H and produces a new graph G H whose shortest paths structure looks locally like H but globally like G.
{"title":"The 4/3 additive spanner exponent is tight","authors":"Amir Abboud, Gregory Bodwin","doi":"10.1145/2897518.2897555","DOIUrl":"https://doi.org/10.1145/2897518.2897555","url":null,"abstract":"A spanner is a sparse subgraph that approximately preserves the pairwise distances of the original graph. It is well known that there is a smooth tradeoff between the sparsity of a spanner and the quality of its approximation, so long as distance error is measured multiplicatively. A central open question in the field is to prove or disprove whether such a tradeoff exists also in the regime of additive error. That is, is it true that for all ε>0, there is a constant kε such that every graph has a spanner on O(n1+ε) edges that preserves its pairwise distances up to +kε? Previous lower bounds are consistent with a positive resolution to this question, while previous upper bounds exhibit the beginning of a tradeoff curve: all graphs have +2 spanners on O(n3/2) edges, +4 spanners on Õ(n7/5) edges, and +6 spanners on O(n4/3) edges. However, progress has mysteriously halted at the n4/3 bound, and despite significant effort from the community, the question has remained open for all 0 < ε < 1/3. Our main result is a surprising negative resolution of the open question, even in a highly generalized setting. We show a new information theoretic incompressibility bound: there is no function that compresses graphs into O(n4/3 − ε) bits so that distance information can be recovered within +no(1) error. As a special case of our theorem, we get a tight lower bound on the sparsity of additive spanners: the +6 spanner on O(n4/3) edges cannot be improved in the exponent, even if any subpolynomial amount of additive error is allowed. Our theorem implies new lower bounds for related objects as well; for example, the twenty-year-old +4 emulator on O(n4/3) edges also cannot be improved in the exponent unless the error allowance is polynomial. Central to our construction is a new type of graph product, which we call the Obstacle Product. Intuitively, it takes two graphs G,H and produces a new graph G H whose shortest paths structure looks locally like H but globally like G.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"185 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133348568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Diptarka Chakraborty, Elazar Goldenberg, M. Koucký
The Hamming and the edit metrics are two common notions of measuring distances between pairs of strings x,y lying in the Boolean hypercube. The edit distance between x and y is defined as the minimum number of character insertion, deletion, and bit flips needed for converting x into y. Whereas, the Hamming distance between x and y is the number of bit flips needed for converting x to y. In this paper we study a randomized injective embedding of the edit distance into the Hamming distance with a small distortion. We show a randomized embedding with quadratic distortion. Namely, for any x,y satisfying that their edit distance equals k, the Hamming distance between the embedding of x and y is O(k2) with high probability. This improves over the distortion ratio of O( n * n) obtained by Jowhari (2012) for small values of k. Moreover, the embedding output size is linear in the input size and the embedding can be computed using a single pass over the input. We provide several applications for this embedding. Among our results we provide a one-pass (streaming) algorithm for edit distance running in space O(s) and computing edit distance exactly up-to distance s1/6. This algorithm is based on kernelization for edit distance that is of independent interest.
{"title":"Streaming algorithms for embedding and computing edit distance in the low distance regime","authors":"Diptarka Chakraborty, Elazar Goldenberg, M. Koucký","doi":"10.1145/2897518.2897577","DOIUrl":"https://doi.org/10.1145/2897518.2897577","url":null,"abstract":"The Hamming and the edit metrics are two common notions of measuring distances between pairs of strings x,y lying in the Boolean hypercube. The edit distance between x and y is defined as the minimum number of character insertion, deletion, and bit flips needed for converting x into y. Whereas, the Hamming distance between x and y is the number of bit flips needed for converting x to y. In this paper we study a randomized injective embedding of the edit distance into the Hamming distance with a small distortion. We show a randomized embedding with quadratic distortion. Namely, for any x,y satisfying that their edit distance equals k, the Hamming distance between the embedding of x and y is O(k2) with high probability. This improves over the distortion ratio of O( n * n) obtained by Jowhari (2012) for small values of k. Moreover, the embedding output size is linear in the input size and the embedding can be computed using a single pass over the input. We provide several applications for this embedding. Among our results we provide a one-pass (streaming) algorithm for edit distance running in space O(s) and computing edit distance exactly up-to distance s1/6. This algorithm is based on kernelization for edit distance that is of independent interest.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123975951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider the following basic learning task: given independent draws from an unknown distribution over a discrete support, output an approximation of the distribution that is as accurate as possible in L1 distance (equivalently, total variation distance, or "statistical distance"). Perhaps surprisingly, it is often possible to "de-noise" the empirical distribution of the samples to return an approximation of the true distribution that is significantly more accurate than the empirical distribution, without relying on any prior assumptions on the distribution. We present an instance optimal learning algorithm which optimally performs this de-noising for every distribution for which such a de-noising is possible. More formally, given n independent draws from a distribution p, our algorithm returns a labelled vector whose expected distance from p is equal to the minimum possible expected error that could be obtained by any algorithm, even one that is given the true unlabeled vector of probabilities of distribution p and simply needs to assign labels---up to an additive subconstant term that is independent of p and goes to zero as n gets large. This somewhat surprising result has several conceptual implications, including the fact that, for any large sample from a distribution over discrete support, prior knowledge of the rates of decay of the tails of the distribution (e.g. power-law type assumptions) is not significantly helpful for the task of learning the distribution. As a consequence of our techniques, we also show that given a set of n samples from an arbitrary distribution, one can accurately estimate the expected number of distinct elements that will be observed in a sample of any size up to n log n. This sort of extrapolation is practically relevant, particularly to domains such as genomics where it is important to understand how much more might be discovered given larger sample sizes, and we are optimistic that our approach is practically viable.
我们考虑以下基本学习任务:在离散支持上给定来自未知分布的独立绘图,输出在L1距离(相当于总变化距离或“统计距离”)中尽可能准确的分布近似值。也许令人惊讶的是,通常可以对样本的经验分布进行“去噪”,以返回比经验分布精确得多的真实分布的近似值,而不依赖于对分布的任何先前假设。我们提出了一种实例最优学习算法,该算法对每个可能进行这种去噪的分布都进行了最优的去噪。更正式地说,给定n个独立的分布p,我们的算法返回一个有标签的向量,它到p的期望距离等于任何算法可以获得的最小可能的期望误差,即使给定分布p的概率的真正未标记向量,只需要分配标签——直到一个与p无关的附加次常数项,随着n变大而趋于零。这个有点令人惊讶的结果有几个概念含义,包括这样一个事实,即对于离散支持分布的任何大样本,分布尾部衰减率的先验知识(例如幂律类型的假设)对学习分布的任务没有显着帮助。由于我们的技术,我们也表明,给定一组n任意分布的样本,可以准确地估计预期数量的不同的元素,将观察到的任何大小的样本到n o (log n)。这种外推法实际上是相关的,尤其是基因组学等领域,重要的是要了解更可能发现更大的样本量,我们乐观,我们的方法是实际可行的。
{"title":"Instance optimal learning of discrete distributions","authors":"G. Valiant, Paul Valiant","doi":"10.1145/2897518.2897641","DOIUrl":"https://doi.org/10.1145/2897518.2897641","url":null,"abstract":"We consider the following basic learning task: given independent draws from an unknown distribution over a discrete support, output an approximation of the distribution that is as accurate as possible in L1 distance (equivalently, total variation distance, or \"statistical distance\"). Perhaps surprisingly, it is often possible to \"de-noise\" the empirical distribution of the samples to return an approximation of the true distribution that is significantly more accurate than the empirical distribution, without relying on any prior assumptions on the distribution. We present an instance optimal learning algorithm which optimally performs this de-noising for every distribution for which such a de-noising is possible. More formally, given n independent draws from a distribution p, our algorithm returns a labelled vector whose expected distance from p is equal to the minimum possible expected error that could be obtained by any algorithm, even one that is given the true unlabeled vector of probabilities of distribution p and simply needs to assign labels---up to an additive subconstant term that is independent of p and goes to zero as n gets large. This somewhat surprising result has several conceptual implications, including the fact that, for any large sample from a distribution over discrete support, prior knowledge of the rates of decay of the tails of the distribution (e.g. power-law type assumptions) is not significantly helpful for the task of learning the distribution. As a consequence of our techniques, we also show that given a set of n samples from an arbitrary distribution, one can accurately estimate the expected number of distinct elements that will be observed in a sample of any size up to n log n. This sort of extrapolation is practically relevant, particularly to domains such as genomics where it is important to understand how much more might be discovered given larger sample sizes, and we are optimistic that our approach is practically viable.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126216732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a candidate reduction for ruling out polynomial-time algorithms for unique games, either under plausible complexity assumptions, or unconditionally for Lasserre semi-definite programs with a constant number of rounds. We analyze the completeness and Lasserre solution of our construction, and provide a soundness analysis in a certain setting of interest. Addressing general settings is tightly connected to a question on Gaussian isoperimetry. Our construction is based on our previous work on the complexity of approximately solving a system of linear equations over reals, which we suggested as an avenue towards a (positive) resolution of the Unique Games Conjecture. The construction employs a new encoding scheme that we call the real code. The real code has two useful properties: like the long code, it has a unique local test, and like the Hadamard code, it has the so-called sub-code covering property.
我们提出了一种候选约简,用于排除唯一博弈的多项式时间算法,无论是在似是而非的复杂性假设下,还是对于具有常数轮数的Lasserre半确定规划的无条件约简。我们分析了我们的结构的完备性和Lasserre解,并在一定的兴趣设置下提供了稳健性分析。处理一般设置与高斯等密度的问题密切相关。我们的构建是基于我们之前关于近似解决实数上的线性方程组的复杂性的工作,我们建议将其作为解决Unique Games Conjecture的途径。这种结构采用了一种新的编码方案,我们称之为实码。真实代码有两个有用的属性:像长代码一样,它有一个唯一的局部测试,像Hadamard代码一样,它有所谓的子代码覆盖属性。
{"title":"Candidate hard unique game","authors":"Subhash Khot, Dana Moshkovitz","doi":"10.1145/2897518.2897531","DOIUrl":"https://doi.org/10.1145/2897518.2897531","url":null,"abstract":"We propose a candidate reduction for ruling out polynomial-time algorithms for unique games, either under plausible complexity assumptions, or unconditionally for Lasserre semi-definite programs with a constant number of rounds. We analyze the completeness and Lasserre solution of our construction, and provide a soundness analysis in a certain setting of interest. Addressing general settings is tightly connected to a question on Gaussian isoperimetry. Our construction is based on our previous work on the complexity of approximately solving a system of linear equations over reals, which we suggested as an avenue towards a (positive) resolution of the Unique Games Conjecture. The construction employs a new encoding scheme that we call the real code. The real code has two useful properties: like the long code, it has a unique local test, and like the Hadamard code, it has the so-called sub-code covering property.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128460813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a new model of weak random sources which we call sumset sources. A sumset source X is the sum of C independent sources, with each source on n bits source having min-entropy k. We show that extractors for this class of sources can be used to give extractors for most classes of weak sources that have been studied previously, including independent sources, affine sources (which generalizes oblivious bit-fixing sources), small space sources, total entropy independent sources, and interleaved sources. This provides a unified approach for randomness extraction. A known extractor for this class of sources, prior to our work, is the Paley graph function introduced by Chor and Goldreich, which works for the sum of 2 independent sources, where one has min-entropy at least 0.51n and the other has logarithmic min-entropy. To the best of our knowledge, the only other known construction is from the work of Kamp, Rao, Vadhan and Zuckerman, which can extract from the sum of exponentially many independent sources. Our main result is an explicit extractor for the sum of C independent sources for some large enough constant C, where each source has polylogarithmic min-entropy. We then use this extractor to obtain improved extractors for other well studied classes of sources including small-space sources, affine sources and interleaved sources.
{"title":"Extractors for sumset sources","authors":"Eshan Chattopadhyay, Xin Li","doi":"10.1145/2897518.2897643","DOIUrl":"https://doi.org/10.1145/2897518.2897643","url":null,"abstract":"We propose a new model of weak random sources which we call sumset sources. A sumset source X is the sum of C independent sources, with each source on n bits source having min-entropy k. We show that extractors for this class of sources can be used to give extractors for most classes of weak sources that have been studied previously, including independent sources, affine sources (which generalizes oblivious bit-fixing sources), small space sources, total entropy independent sources, and interleaved sources. This provides a unified approach for randomness extraction. A known extractor for this class of sources, prior to our work, is the Paley graph function introduced by Chor and Goldreich, which works for the sum of 2 independent sources, where one has min-entropy at least 0.51n and the other has logarithmic min-entropy. To the best of our knowledge, the only other known construction is from the work of Kamp, Rao, Vadhan and Zuckerman, which can extract from the sum of exponentially many independent sources. Our main result is an explicit extractor for the sum of C independent sources for some large enough constant C, where each source has polylogarithmic min-entropy. We then use this extractor to obtain improved extractors for other well studied classes of sources including small-space sources, affine sources and interleaved sources.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127179538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michael B. Cohen, Y. Lee, G. Miller, J. Pachocki, Aaron Sidford
In this paper we provide faster algorithms for solving the geometric median problem: given n points in d compute a point that minimizes the sum of Euclidean distances to the points. This is one of the oldest non-trivial problems in computational geometry yet despite a long history of research the previous fastest running times for computing a (1+є)-approximate geometric median were O(d· n4/3є−8/3) by Chin et. al, Õ(dexpє−4logє−1) by Badoiu et. al, O(nd+poly(d,є−1)) by Feldman and Langberg, and the polynomial running time of O((nd)O(1)log1/є) by Parrilo and Sturmfels and Xue and Ye. In this paper we show how to compute such an approximate geometric median in time O(ndlog3n/є) and O(dє−2). While our O(dє−2) is a fairly straightforward application of stochastic subgradient descent, our O(ndlog3n/є) time algorithm is a novel long step interior point method. We start with a simple O((nd)O(1)log1/є) time interior point method and show how to improve it, ultimately building an algorithm that is quite non-standard from the perspective of interior point literature. Our result is one of few cases of outperforming standard interior point theory. Furthermore, it is the only case we know of where interior point methods yield a nearly linear time algorithm for a canonical optimization problem that traditionally requires superlinear time.
{"title":"Geometric median in nearly linear time","authors":"Michael B. Cohen, Y. Lee, G. Miller, J. Pachocki, Aaron Sidford","doi":"10.1145/2897518.2897647","DOIUrl":"https://doi.org/10.1145/2897518.2897647","url":null,"abstract":"In this paper we provide faster algorithms for solving the geometric median problem: given n points in d compute a point that minimizes the sum of Euclidean distances to the points. This is one of the oldest non-trivial problems in computational geometry yet despite a long history of research the previous fastest running times for computing a (1+є)-approximate geometric median were O(d· n4/3є−8/3) by Chin et. al, Õ(dexpє−4logє−1) by Badoiu et. al, O(nd+poly(d,є−1)) by Feldman and Langberg, and the polynomial running time of O((nd)O(1)log1/є) by Parrilo and Sturmfels and Xue and Ye. In this paper we show how to compute such an approximate geometric median in time O(ndlog3n/є) and O(dє−2). While our O(dє−2) is a fairly straightforward application of stochastic subgradient descent, our O(ndlog3n/є) time algorithm is a novel long step interior point method. We start with a simple O((nd)O(1)log1/є) time interior point method and show how to improve it, ultimately building an algorithm that is quite non-standard from the perspective of interior point literature. Our result is one of few cases of outperforming standard interior point theory. Furthermore, it is the only case we know of where interior point methods yield a nearly linear time algorithm for a canonical optimization problem that traditionally requires superlinear time.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116824631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present the first polynomial-time approximation scheme (PTAS), i.e., (1+ε)-approximation algorithm for any constant ε> 0, for the planar group Steiner tree problem (in which each group lies on a boundary of a face). This result improves on the best previous approximation factor of O(logn (loglogn)O(1)). We achieve this result via a novel and powerful technique called spanner bootstrapping, which allows one to bootstrap from a superconstant approximation factor (even superpolynomial in the input size) all the way down to a PTAS. This is in contrast with the popular existing approach for planar PTASs of constructing light-weight spanners in one iteration, which notably requires a constant-factor approximate solution to start from. Spanner bootstrapping removes one of the main barriers for designing PTASs for problems which have no known constant-factor approximation (even on planar graphs), and thus can be used to obtain PTASs for several difficult-to-approximate problems. Our second major contribution required for the planar group Steiner tree PTAS is a spanner construction, which reduces the graph to have total weight within a factor of the optimal solution while approximately preserving the optimal solution. This is particularly challenging because group Steiner tree requires deciding which terminal in each group to connect by the tree, making it much harder than recent previous approaches to construct spanners for planar TSP by Klein [SIAM J. Computing 2008], subset TSP by Klein [STOC 2006], Steiner tree by Borradaile, Klein, and Mathieu [ACM Trans. Algorithms 2009], and Steiner forest by Bateni, Hajiaghayi, and Marx [J. ACM 2011] (and its improvement to an efficient PTAS by Eisenstat, Klein, and Mathieu [SODA 2012]. The main conceptual contribution here is realizing that selecting which terminals may be relevant is essentially a complicated prize-collecting process: we have to carefully weigh the cost and benefits of reaching or avoiding certain terminals in the spanner. Via a sequence of involved prize-collecting procedures, we can construct a spanner that reaches a set of terminals that is sufficient for an almost-optimal solution. Our PTAS for planar group Steiner tree implies the first PTAS for geometric Euclidean group Steiner tree with obstacles, as well as a (2+)-approximation algorithm for group TSP with obstacles, improving over the best previous constant-factor approximation algorithms. By contrast, we show that planar group Steiner forest, a slight generalization of planar group Steiner tree, is APX-hard on planar graphs of treewidth 3, even if the groups are pairwise disjoint and every group is a vertex or an edge.
针对平面群Steiner树问题(其中每一群位于一个面的边界上),给出了第一个多项式时间逼近格式(PTAS),即ε> 0的任意常数的(1+ε)逼近算法。这个结果改进了之前的最佳近似因子O(logn (loglog)O(1))。我们通过一种称为扳手引导的新颖而强大的技术实现了这一结果,该技术允许人们从超常数近似因子(甚至输入大小的超多项式)一直引导到PTAS。这与在一次迭代中构造轻型扳手的平面pass的流行现有方法形成对比,后者明显需要一个常数因子近似解作为起点。扳手自举消除了为没有已知常因子近似(甚至在平面图上)的问题设计PTASs的主要障碍之一,因此可以用于获得一些难以近似的问题的PTASs。我们对平面群Steiner树PTAS的第二个主要贡献是扳手构造,它减少了图的总权重,使其在最优解的一个因子内,同时近似保留了最优解。这尤其具有挑战性,因为组Steiner树需要决定每个组中的哪个终端通过树连接,这使得它比最近的方法更难以构建平面TSP(由Klein [SIAM J. Computing 2008],子集TSP由Klein [STOC 2006],由Borradaile, Klein和Mathieu [ACM Trans]的Steiner树)的钳子。算法2009],Steiner forest by Bateni, Hajiaghayi, and Marx [J]。(以及Eisenstat, Klein, and Mathieu对高效PTAS的改进[SODA 2012])。这里的主要概念贡献是认识到选择哪些终端可能是相关的本质上是一个复杂的奖励收集过程:我们必须仔细权衡在扳手中达到或避免某些终端的成本和收益。通过一系列相关的奖品收集程序,我们可以构造一个扳手,它可以达到一组足以产生几乎最优解的终端。我们的平面群Steiner树的PTAS暗示了具有障碍物的几何欧几里得群Steiner树的第一个PTAS,以及具有障碍物的TSP群的(2+)-近似算法,改进了之前最好的常因子近似算法。通过对比,我们证明了平面群Steiner树(平面群Steiner树的一种轻微推广)在树宽为3的平面图上是APX-hard的,即使群是两两不相交且每群都是一个顶点或一条边。
{"title":"A PTAS for planar group Steiner tree via spanner bootstrapping and prize collecting","authors":"M. Bateni, E. Demaine, M. Hajiaghayi, D. Marx","doi":"10.1145/2897518.2897549","DOIUrl":"https://doi.org/10.1145/2897518.2897549","url":null,"abstract":"We present the first polynomial-time approximation scheme (PTAS), i.e., (1+ε)-approximation algorithm for any constant ε> 0, for the planar group Steiner tree problem (in which each group lies on a boundary of a face). This result improves on the best previous approximation factor of O(logn (loglogn)O(1)). We achieve this result via a novel and powerful technique called spanner bootstrapping, which allows one to bootstrap from a superconstant approximation factor (even superpolynomial in the input size) all the way down to a PTAS. This is in contrast with the popular existing approach for planar PTASs of constructing light-weight spanners in one iteration, which notably requires a constant-factor approximate solution to start from. Spanner bootstrapping removes one of the main barriers for designing PTASs for problems which have no known constant-factor approximation (even on planar graphs), and thus can be used to obtain PTASs for several difficult-to-approximate problems. Our second major contribution required for the planar group Steiner tree PTAS is a spanner construction, which reduces the graph to have total weight within a factor of the optimal solution while approximately preserving the optimal solution. This is particularly challenging because group Steiner tree requires deciding which terminal in each group to connect by the tree, making it much harder than recent previous approaches to construct spanners for planar TSP by Klein [SIAM J. Computing 2008], subset TSP by Klein [STOC 2006], Steiner tree by Borradaile, Klein, and Mathieu [ACM Trans. Algorithms 2009], and Steiner forest by Bateni, Hajiaghayi, and Marx [J. ACM 2011] (and its improvement to an efficient PTAS by Eisenstat, Klein, and Mathieu [SODA 2012]. The main conceptual contribution here is realizing that selecting which terminals may be relevant is essentially a complicated prize-collecting process: we have to carefully weigh the cost and benefits of reaching or avoiding certain terminals in the spanner. Via a sequence of involved prize-collecting procedures, we can construct a spanner that reaches a set of terminals that is sufficient for an almost-optimal solution. Our PTAS for planar group Steiner tree implies the first PTAS for geometric Euclidean group Steiner tree with obstacles, as well as a (2+)-approximation algorithm for group TSP with obstacles, improving over the best previous constant-factor approximation algorithms. By contrast, we show that planar group Steiner forest, a slight generalization of planar group Steiner tree, is APX-hard on planar graphs of treewidth 3, even if the groups are pairwise disjoint and every group is a vertex or an edge.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"774 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116143392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}