In the classical Node-Disjoint Paths (NDP) problem, the input consists of an undirected n-vertex graph G, and a collection M={(s_1,t_1),…,(s_k,t_k)} of pairs of its vertices, called source-destination, or demand, pairs. The goal is to route the largest possible number of the demand pairs via node-disjoint paths. The best current approximation for the problem is achieved by a simple greedy algorithm, whose approximation factor is O(√n), while the best current negative result is an Ω(log^{1/2−δ} n)-hardness of approximation for any constant δ, under standard complexity assumptions. Even seemingly simple special cases of the problem are still poorly understood: when the input graph is a grid, the best current algorithm achieves an Õ(n^{1/4})-approximation, and when it is a general planar graph, the best current approximation ratio of an efficient algorithm is Õ(n^{9/19}). The best currently known lower bound for both these versions of the problem is APX-hardness. In this paper we prove that NDP is 2^{Ω(√log n)}-hard to approximate, unless all problems in NP have algorithms with running time n^{O(log n)}. Our result holds even when the underlying graph is a planar graph with maximum vertex degree 4, and all source vertices lie on the boundary of a single face (but the destination vertices may lie anywhere in the graph). We extend this result to the closely related Edge-Disjoint Paths problem, showing the same hardness of approximation ratio even for sub-cubic planar graphs with all sources lying on the boundary of a single face.
{"title":"New hardness results for routing on disjoint paths","authors":"Julia Chuzhoy, David H. K. Kim, Rachit Nimavat","doi":"10.1145/3055399.3055411","DOIUrl":"https://doi.org/10.1145/3055399.3055411","url":null,"abstract":"In the classical Node-Disjoint Paths (NDP) problem, the input consists of an undirected n-vertex graph G, and a collection M={(s1,t1),…,(sk,tk)} of pairs of its vertices, called source-destination, or demand, pairs. The goal is to route the largest possible number of the demand pairs via node-disjoint paths. The best current approximation for the problem is achieved by a simple greedy algorithm, whose approximation factor is O(√n), while the best current negative result is an Ω(log1/2-δn)-hardness of approximation for any constant δ, under standard complexity assumptions. Even seemingly simple special cases of the problem are still poorly understood: when the input graph is a grid, the best current algorithm achieves an Õ(n1/4)-approximation, and when it is a general planar graph, the best current approximation ratio of an efficient algorithm is Õ(n9/19). The best currently known lower bound for both these versions of the problem is APX-hardness. In this paper we prove that NDP is 2Ω(√logn)-hard to approximate, unless all problems in NP have algorithms with running time nO(logn). Our result holds even when the underlying graph is a planar graph with maximum vertex degree 4, and all source vertices lie on the boundary of a single face (but the destination vertices may lie anywhere in the graph). We extend this result to the closely related Edge-Disjoint Paths problem, showing the same hardness of approximation ratio even for sub-cubic planar graphs with all sources lying on the boundary of a single face.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"106 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89312751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Several fundamental optimization and counting problems arising in computer science, mathematics and physics can be reduced to one of the following computational tasks involving polynomials and set systems: given oracle access to an m-variate real polynomial g and to a family of (multi-)subsets ℬ of [m], (1) compute the sum of coefficients of monomials in g corresponding to all the sets that appear in ℬ, or (2) find S ∈ ℬ such that the monomial in g corresponding to S has the largest coefficient in g. Special cases of these problems, such as computing permanents and mixed discriminants, sampling from determinantal point processes, and maximizing sub-determinants with combinatorial constraints have been topics of much recent interest in theoretical computer science. In this paper we present a general convex programming framework geared to solve both of these problems. Subsequently, we show that, roughly, when g is a real stable polynomial with non-negative coefficients and ℬ is a matroid, the integrality gap of our convex relaxation is finite and depends only on m (and not on the coefficients of g). Prior to this work, such results were known only in important but sporadic cases that relied heavily on the structure of either g or ℬ; it was not even a priori clear if one could formulate a convex relaxation that has a finite integrality gap beyond these special cases. Two notable examples are a result by Gurvits for real stable polynomials g when ℬ contains one element, and a result by Nikolov and Singh for a family of multi-linear real stable polynomials when ℬ is the partition matroid. This work, which encapsulates almost all interesting cases of g and ℬ, benefits from both: it is inspired by the latter in coming up with the right convex programming relaxation and by the former in deriving the integrality gap. However, proving our results requires extensions of both; in that process we come up with new notions and connections between real stable polynomials and matroids which might be of independent interest.
{"title":"Real stable polynomials and matroids: optimization and counting","authors":"D. Straszak, Nisheeth K. Vishnoi","doi":"10.1145/3055399.3055457","DOIUrl":"https://doi.org/10.1145/3055399.3055457","url":null,"abstract":"Several fundamental optimization and counting problems arising in computer science, mathematics and physics can be reduced to one of the following computational tasks involving polynomials and set systems: given an oracle access to an m-variate real polynomial g and to a family of (multi-)subsets ℬ of [m], (1) compute the sum of coefficients of monomials in g corresponding to all the sets that appear in B(1), or find S ε ℬ such that the monomial in g corresponding to S has the largest coefficient in g. Special cases of these problems, such as computing permanents and mixed discriminants, sampling from determinantal point processes, and maximizing sub-determinants with combinatorial constraints have been topics of much recent interest in theoretical computer science. In this paper we present a general convex programming framework geared to solve both of these problems. Subsequently, we show that roughly, when g is a real stable polynomial with non-negative coefficients and B is a matroid, the integrality gap of our convex relaxation is finite and depends only on m (and not on the coefficients of g). Prior to this work, such results were known only in important but sporadic cases that relied heavily on the structure of either g or ℬ; it was not even a priori clear if one could formulate a convex relaxation that has a finite integrality gap beyond these special cases. Two notable examples are a result by Gurvits for real stable polynomials g when ℬ contains one element, and a result by Nikolov and Singh for a family of multi-linear real stable polynomials when B is the partition matroid. This work, which encapsulates almost all interesting cases of g and B, benefits from both - it is inspired by the latter in coming up with the right convex programming relaxation and the former in deriving the integrality gap. However, proving our results requires extensions of both; in that process we come up with new notions and connections between real stable polynomials and matroids which might be of independent interest.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"36 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74439956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We give a Las Vegas data structure which maintains a minimum spanning forest in an n-vertex edge-weighted undirected dynamic graph undergoing updates consisting of any mixture of edge insertions and deletions. Each update is supported in O(n^{1/2 − c}) worst-case time w.h.p. where c > 0 is some constant, and this bound also holds in expectation. This is the first data structure achieving an improvement over the O(√n) deterministic worst-case update time of Eppstein et al., a bound that has been standing for 25 years. In fact, it was previously not even known how to maintain a spanning forest of an unweighted graph in worst-case time polynomially faster than Θ(√n). Our result is achieved by first giving a reduction from fully-dynamic to decremental minimum spanning forest preserving worst-case update time up to logarithmic factors. Then decremental minimum spanning forest is solved using several novel techniques, one of which involves keeping track of low-conductance cuts in a dynamic graph. An immediate corollary of our result is the first Las Vegas data structure for fully-dynamic connectivity where each update is handled in worst-case time polynomially faster than Θ(√n) w.h.p.; this data structure has O(1) worst-case query time.
{"title":"Fully-dynamic minimum spanning forest with improved worst-case update time","authors":"Christian Wulff-Nilsen","doi":"10.1145/3055399.3055415","DOIUrl":"https://doi.org/10.1145/3055399.3055415","url":null,"abstract":"We give a Las Vegas data structure which maintains a minimum spanning forest in an n-vertex edge-weighted undirected dynamic graph undergoing updates consisting of any mixture of edge insertions and deletions. Each update is supported in O(n1/2 - c) worst-case time w.h.p. where c > 0 is some constant, and this bound also holds in expectation. This is the first data structure achieving an improvement over the O(√n) deterministic worst-case update time of Eppstein et al., a bound that has been standing for 25 years. In fact, it was previously not even known how to maintain a spanning forest of an unweighted graph in worst-case time polynomially faster than Θ(√n). Our result is achieved by first giving a reduction from fully-dynamic to decremental minimum spanning forest preserving worst-case update time up to logarithmic factors. Then decremental minimum spanning forest is solved using several novel techniques, one of which involves keeping track of low-conductance cuts in a dynamic graph. An immediate corollary of our result is the first Las Vegas data structure for fully-dynamic connectivity where each update is handled in worst-case time polynomially faster than Θ(√n) w.h.p.; this data structure has O(1) worst-case query time.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81188623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper is centered on the complexity of graph problems in the well-studied LOCAL model of distributed computing, introduced by Linial [FOCS '87]. It is widely known that for many of the classic distributed graph problems (including maximal independent set (MIS) and (Δ+1)-vertex coloring), the randomized complexity is at most polylogarithmic in the size n of the network, while the best deterministic complexity is typically 2^{O(√log n)}. Understanding and potentially narrowing down this exponential gap is considered to be one of the central long-standing open questions in the area of distributed graph algorithms. We investigate the problem by introducing a complexity-theoretic framework that allows us to shed some light on the role of randomness in the LOCAL model. We define the SLOCAL model as a sequential version of the LOCAL model. Our framework allows us to prove completeness results with respect to the class of problems which can be solved efficiently in the SLOCAL model, implying that if any of the complete problems can be solved deterministically in log n rounds in the LOCAL model, we can deterministically solve all efficient SLOCAL problems (including MIS and (Δ+1)-coloring) in log n rounds in the LOCAL model. Perhaps most surprisingly, we show that a rather rudimentary-looking graph coloring problem is complete in the above sense: color the nodes of a graph with colors red and blue such that each node of sufficiently large polylogarithmic degree has at least one neighbor of each color. The problem admits a trivial zero-round randomized solution. The result can be viewed as showing that the only obstacle to getting efficient deterministic algorithms in the LOCAL model is an efficient algorithm to approximately round fractional values into integer values. In addition, our formal framework also allows us to develop polylogarithmic-time randomized distributed algorithms in a simpler way. As a result, we provide a polylog-time distributed approximation scheme for arbitrary distributed covering and packing integer linear programs.
{"title":"On the complexity of local distributed graph problems","authors":"M. Ghaffari, F. Kuhn, Yannic Maus","doi":"10.1145/3055399.3055471","DOIUrl":"https://doi.org/10.1145/3055399.3055471","url":null,"abstract":"This paper is centered on the complexity of graph problems in the well-studied LOCAL model of distributed computing, introduced by Linial [FOCS '87]. It is widely known that for many of the classic distributed graph problems (including maximal independent set (MIS) and (Δ+1)-vertex coloring), the randomized complexity is at most polylogarithmic in the size n of the network, while the best deterministic complexity is typically 2O(√logn). Understanding and potentially narrowing down this exponential gap is considered to be one of the central long-standing open questions in the area of distributed graph algorithms. We investigate the problem by introducing a complexity-theoretic framework that allows us to shed some light on the role of randomness in the LOCAL model. We define the SLOCAL model as a sequential version of the LOCAL model. Our framework allows us to prove completeness results with respect to the class of problems which can be solved efficiently in the SLOCAL model, implying that if any of the complete problems can be solved deterministically in logn rounds in the LOCAL model, we can deterministically solve all efficient SLOCAL-problems (including MIS and (Δ+1)-coloring) in logn rounds in the LOCAL model. Perhaps most surprisingly, we show that a rather rudimentary looking graph coloring problem is complete in the above sense: Color the nodes of a graph with colors red and blue such that each node of sufficiently large polylogarithmic degree has at least one neighbor of each color. The problem admits a trivial zero-round randomized solution. The result can be viewed as showing that the only obstacle to getting efficient determinstic algorithms in the LOCAL model is an efficient algorithm to approximately round fractional values into integer values. In addition, our formal framework also allows us to develop polylogarithmic-time randomized distributed algorithms in a simpler way. As a result, we provide a polylog-time distributed approximation scheme for arbitrary distributed covering and packing integer linear programs.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89131534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The vast majority of theoretical results in machine learning and statistics assume that the training data is a reliable reflection of the phenomena to be learned. Similarly, most learning techniques used in practice are brittle to the presence of large amounts of biased or malicious data. Motivated by this, we consider two frameworks for studying estimation, learning, and optimization in the presence of significant fractions of arbitrary data. The first framework, list-decodable learning, asks whether it is possible to return a list of answers such that at least one is accurate. For example, given a dataset of n points for which an unknown subset of αn points are drawn from a distribution of interest, and no assumptions are made about the remaining (1 - α)n points, is it possible to return a list of poly(1/α) answers? The second framework, which we term the semi-verified model, asks whether a small dataset of trusted data (drawn from the distribution in question) can be used to extract accurate information from a much larger but untrusted dataset (of which only an α-fraction is drawn from the distribution). We show strong positive results in both settings, and provide an algorithm for robust learning in a very general stochastic optimization setting. This result has immediate implications for robustly estimating the mean of distributions with bounded second moments, robustly learning mixtures of such distributions, and robustly finding planted partitions in random graphs in which significant portions of the graph have been perturbed by an adversary.
{"title":"Learning from untrusted data","authors":"M. Charikar, J. Steinhardt, G. Valiant","doi":"10.1145/3055399.3055491","DOIUrl":"https://doi.org/10.1145/3055399.3055491","url":null,"abstract":"The vast majority of theoretical results in machine learning and statistics assume that the training data is a reliable reflection of the phenomena to be learned. Similarly, most learning techniques used in practice are brittle to the presence of large amounts of biased or malicious data. Motivated by this, we consider two frameworks for studying estimation, learning, and optimization in the presence of significant fractions of arbitrary data. The first framework, list-decodable learning, asks whether it is possible to return a list of answers such that at least one is accurate. For example, given a dataset of n points for which an unknown subset of αn points are drawn from a distribution of interest, and no assumptions are made about the remaining (1 - α)n points, is it possible to return a list of poly(1/α) answers? The second framework, which we term the semi-verified model, asks whether a small dataset of trusted data (drawn from the distribution in question) can be used to extract accurate information from a much larger but untrusted dataset (of which only an α-fraction is drawn from the distribution). We show strong positive results in both settings, and provide an algorithm for robust learning in a very general stochastic optimization setting. This result has immediate implications for robustly estimating the mean of distributions with bounded second moments, robustly learning mixtures of such distributions, and robustly finding planted partitions in random graphs in which significant portions of the graph have been perturbed by an adversary.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88682645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The partial coloring method is one of the most powerful and widely used methods in combinatorial discrepancy problems. However, in many cases it leads to sub-optimal bounds, as the partial coloring step must be iterated a logarithmic number of times, and the errors can add up in an adversarial way. We give a new and general algorithmic framework that overcomes the limitations of the partial coloring method and can be applied in a black-box manner to various problems. Using this framework, we give new improved bounds and algorithms for several classic problems in discrepancy. In particular, for Tusnády's problem, we give an improved O(log^2 n) bound for the discrepancy of axis-parallel rectangles and, more generally, an O_d(log^d n) bound for d-dimensional boxes in ℝ^d. Previously, even non-constructively, the best bounds were O(log^{2.5} n) and O_d(log^{d+0.5} n) respectively. Similarly, for the Steinitz problem we give the first algorithm that matches the best known non-constructive bounds due to Banaszczyk in the 𝓁_∞ case, and improves the previous algorithmic bounds substantially in the 𝓁_2 case. Our framework is based upon a substantial generalization of the techniques developed recently in the context of the Komlós discrepancy problem.
{"title":"Algorithmic discrepancy beyond partial coloring","authors":"N. Bansal, S. Garg","doi":"10.1145/3055399.3055490","DOIUrl":"https://doi.org/10.1145/3055399.3055490","url":null,"abstract":"The partial coloring method is one of the most powerful and widely used method in combinatorial discrepancy problems. However, in many cases it leads to sub-optimal bounds as the partial coloring step must be iterated a logarithmic number of times, and the errors can add up in an adversarial way. We give a new and general algorithmic framework that overcomes the limitations of the partial coloring method and can be applied in a black-box manner to various problems. Using this framework, we give new improved bounds and algorithms for several classic problems in discrepancy. In particular, for Tusnady&'s problem, we give an improved O(log2 n) bound for discrepancy of axis-parallel rectangles and more generally an Od(logdn) bound for d-dimensional boxes in ℝd. Previously, even non-constructively, the best bounds were O(log2.5 n) and Od(logd+0.5n) respectively. Similarly, for the Steinitz problem we give the first algorithm that matches the best known non-constructive bounds due to Banaszczyk in the 𝓁∞ case, and improves the previous algorithmic bounds substantially in the 𝓁2 case. Our framework is based upon a substantial generalization of the techniques developed recently in the context of the Komlós discrepancy problem.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86293632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a new algorithmic framework, called “partial rejection sampling”, to draw samples exactly from a product distribution, conditioned on none of a number of bad events occurring. Our framework builds (perhaps surprising) new connections between the variable framework of the Lovász Local Lemma and some classical sampling algorithms such as the “cycle-popping” algorithm for rooted spanning trees by Wilson. Among other applications, we discover new algorithms to sample satisfying assignments of k-CNF formulas with bounded variable occurrences.
{"title":"Uniform sampling through the Lovasz local lemma","authors":"Heng Guo, M. Jerrum, Jingcheng Liu","doi":"10.1145/3055399.3055410","DOIUrl":"https://doi.org/10.1145/3055399.3055410","url":null,"abstract":"We propose a new algorithmic framework, called “partial rejection sampling”, to draw samples exactly from a product distribution, conditioned on none of a number of bad events occurring. Our framework builds (perhaps surprising) new connections between the variable framework of the Lovász Local Lemma and some clas- sical sampling algorithms such as the “cycle-popping” algorithm for rooted spanning trees by Wilson. Among other applications, we discover new algorithms to sample satisfying assignments of k-CNF formulas with bounded variable occurrences.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74411089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A basic combinatorial interpretation of Shannon's entropy function is via the "20 questions" game. This cooperative game is played by two players, Alice and Bob: Alice picks a distribution Π over the numbers {1,…,n}, and announces it to Bob. She then chooses a number x according to Π, and Bob attempts to identify x using as few Yes/No queries as possible, on average. An optimal strategy for the "20 questions" game is given by a Huffman code for Π: Bob's questions reveal the codeword for x bit by bit. This strategy finds x using fewer than H(Π)+1 questions on average. However, the questions asked by Bob could be arbitrary. In this paper, we investigate the following question: are there restricted sets of questions that match the performance of Huffman codes, either exactly or approximately? Our first main result shows that for every distribution Π, Bob has a strategy that uses only questions of the form "x < c?" and "x = c?", and uncovers x using at most H(Π)+1 questions on average, matching the performance of Huffman codes in this sense. We also give a natural set of O(rn^{1/r}) questions that achieves a performance of at most H(Π)+r, and show that Ω(rn^{1/r}) questions are required to achieve such a guarantee. Our second main result gives a set Q of 1.25^{n+o(n)} questions such that for every distribution Π, Bob can implement an optimal strategy for Π using only questions from Q. We also show that 1.25^{n−o(n)} questions are needed, for infinitely many n. If we allow a small slack of r over the optimal strategy, then roughly (rn)^{Θ(1/r)} questions are necessary and sufficient.
{"title":"Twenty (simple) questions","authors":"Y. Dagan, Yuval Filmus, Ariel Gabizon, S. Moran","doi":"10.1145/3055399.3055422","DOIUrl":"https://doi.org/10.1145/3055399.3055422","url":null,"abstract":"A basic combinatorial interpretation of Shannon's entropy function is via the \"20 questions\" game. This cooperative game is played by two players, Alice and Bob: Alice picks a distribution Π over the numbers {1,…,n}, and announces it to Bob. She then chooses a number x according to Π, and Bob attempts to identify x using as few Yes/No queries as possible, on average. An optimal strategy for the \"20 questions\" game is given by a Huffman code for Π: Bob's questions reveal the codeword for x bit by bit. This strategy finds x using fewer than H(Π)+1 questions on average. However, the questions asked by Bob could be arbitrary. In this paper, we investigate the following question: *Are there restricted sets of questions that match the performance of Huffman codes, either exactly or approximately? Our first main result shows that for every distribution Π, Bob has a strategy that uses only questions of the form \"x < c?\" and \"x = c?\", and uncovers x using at most H(Π)+1 questions on average, matching the performance of Huffman codes in this sense. We also give a natural set of O(rn1/r) questions that achieve a performance of at most H(Π)+r, and show that Ωrn1/r) questions are required to achieve such a guarantee. Our second main result gives a set Q of 1.25n+o(n) questions such that for every distribution Π, Bob can implement an optimal strategy for Π using only questions from Q. We also show that 1.25n-o(n) questions are needed, for infinitely many n. If we allow a small slack of r over the optimal strategy, then roughly (rn)Θ(1/r) questions are necessary and sufficient.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82995242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
One of the biggest open problems in external memory data structures is the priority queue problem with DecreaseKey operations. If only Insert and ExtractMin operations need to be supported, one can design a comparison-based priority queue performing O((N/B) lg_{M/B} N) I/Os over a sequence of N operations, where B is the disk block size in number of words and M is the main memory size in number of words. This matches the lower bound for comparison-based sorting and is hence optimal for comparison-based priority queues. However, if we also need to support DecreaseKeys, the performance of the best known priority queue is only O((N/B) lg_2 N) I/Os. The big open question is whether a degradation in performance really is necessary. We answer this question affirmatively by proving a lower bound of Ω((N/B) lg lg_N B) I/Os for processing a sequence of N intermixed Insert, ExtractMin and DecreaseKey operations. Our lower bound is proved in the cell probe model and thus holds also for non-comparison-based priority queues.
{"title":"DecreaseKeys are expensive for external memory priority queues","authors":"Kasper Eenberg, Kasper Green Larsen, Huacheng Yu","doi":"10.1145/3055399.3055437","DOIUrl":"https://doi.org/10.1145/3055399.3055437","url":null,"abstract":"One of the biggest open problems in external memory data structures is the priority queue problem with DecreaseKey operations. If only Insert and ExtractMin operations need to be supported, one can design a comparison-based priority queue performing O((N/B)lgM/B N) I/Os over a sequence of N operations, where B is the disk block size in number of words and M is the main memory size in number of words. This matches the lower bound for comparison-based sorting and is hence optimal for comparison-based priority queues. However, if we also need to support DecreaseKeys, the performance of the best known priority queue is only O((N/B) lg2 N) I/Os. The big open question is whether a degradation in performance really is necessary. We answer this question affirmatively by proving a lower bound of Ω((N/B) lglgN B) I/Os for processing a sequence of N intermixed Insert, ExtraxtMin and DecreaseKey operations. Our lower bound is proved in the cell probe model and thus holds also for non-comparison-based priority queues.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79745123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time which scales linearly in the underlying dimension and the number of training examples. The time complexity of our algorithm to find an approximate local minimum is even faster than that of gradient descent to find a critical point. Our algorithm applies to a general class of optimization problems including training a neural network and other non-convex objectives arising in machine learning.
{"title":"Finding approximate local minima faster than gradient descent","authors":"Naman Agarwal, Z. Zhu, Brian Bullins, Elad Hazan, Tengyu Ma","doi":"10.1145/3055399.3055464","DOIUrl":"https://doi.org/10.1145/3055399.3055464","url":null,"abstract":"We design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time which scales linearly in the underlying dimension and the number of training examples. The time complexity of our algorithm to find an approximate local minimum is even faster than that of gradient descent to find a critical point. Our algorithm applies to a general class of optimization problems including training a neural network and other non-convex objectives arising in machine learning.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74443111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}