A matrix M: A × X → {−1,1} corresponds to the following learning problem: An unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples, (a_1, b_1), (a_2, b_2), …, where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i, x). Assume that k, l, r are such that any submatrix of M with at least 2^{−k}·|A| rows and at least 2^{−l}·|X| columns has bias at most 2^{−r}. We show that any learning algorithm for the learning problem corresponding to M requires either a memory of size at least Ω(k·l), or at least 2^{Ω(r)} samples. The result holds even if the learner has an exponentially small success probability (of 2^{−Ω(r)}). In particular, this shows that for a large class of learning problems, any learning algorithm requires either a memory of size at least Ω((log|X|)·(log|A|)) or an exponential number of samples, achieving a tight Ω((log|X|)·(log|A|)) lower bound on the size of the memory, rather than the bound of Ω(min{(log|X|)^2, (log|A|)^2}) obtained in previous works by Raz [FOCS’17] and Moshkovitz and Moshkovitz [ITCS’18]. Moreover, our result implies all previous memory-samples lower bounds, as well as a number of new applications. Our proof builds on the work of Raz [FOCS’17] that gave a general technique for proving memory-samples lower bounds.
{"title":"Extractor-based time-space lower bounds for learning","authors":"Sumegha Garg, R. Raz, Avishay Tal","doi":"10.1145/3188745.3188962","DOIUrl":"https://doi.org/10.1145/3188745.3188962","url":null,"abstract":"A matrix M: A × X → {−1,1} corresponds to the following learning problem: An unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples, (a1, b1), (a2, b2) …, where for every i, ai ∈ A is chosen uniformly at random and bi = M(ai,x). Assume that k, l, r are such that any submatrix of M of at least 2−k · |A| rows and at least 2−l · |X| columns, has a bias of at most 2−r. We show that any learning algorithm for the learning problem corresponding to M requires either a memory of size at least Ω(k · l ), or at least 2Ω(r) samples. The result holds even if the learner has an exponentially small success probability (of 2−Ω(r)). In particular, this shows that for a large class of learning problems, any learning algorithm requires either a memory of size at least Ω((log|X|) · (log|A|)) or an exponential number of samples, achieving a tight Ω((log|X|) · (log|A|)) lower bound on the size of the memory, rather than a bound of Ω(min{(log|X|)2,(log|A|)2}) obtained in previous works by Raz [FOCS’17] and Moshkovitz and Moshkovitz [ITCS’18]. Moreover, our result implies all previous memory-samples lower bounds, as well as a number of new applications. Our proof builds on the work of Raz [FOCS’17] that gave a general technique for proving memory samples lower bounds.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82337508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An important result in discrepancy due to Banaszczyk states that for any set of n vectors in ℝ^m of ℓ_2 norm at most 1 and any convex body K in ℝ^m of Gaussian measure at least half, there exists a ±1 combination of these vectors which lies in 5K. This result implies the best known bounds for several problems in discrepancy. Banaszczyk’s proof of this result is non-constructive, and an open problem has been to give an efficient algorithm to find such a ±1 combination of the vectors. In this paper, we resolve this question and give an efficient randomized algorithm to find a ±1 combination of the vectors which lies in cK for an absolute constant c > 0. This leads to new efficient algorithms for several problems in discrepancy theory.
{"title":"The Gram-Schmidt walk: a cure for the Banaszczyk blues","authors":"N. Bansal, D. Dadush, S. Garg, Shachar Lovett","doi":"10.1145/3188745.3188850","DOIUrl":"https://doi.org/10.1145/3188745.3188850","url":null,"abstract":"An important result in discrepancy due to Banaszczyk states that for any set of n vectors in ℝm of ℓ2 norm at most 1 and any convex body K in ℝm of Gaussian measure at least half, there exists a ± 1 combination of these vectors which lies in 5K. This result implies the best known bounds for several problems in discrepancy. Banaszczyk’s proof of this result is non-constructive and an open problem has been to give an efficient algorithm to find such a ± 1 combination of the vectors. In this paper, we resolve this question and give an efficient randomized algorithm to find a ± 1 combination of the vectors which lies in cK for c>0 an absolute constant. This leads to new efficient algorithms for several problems in discrepancy theory.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"50 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75072356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study the problem of approximating the number of k-cliques in a graph when given query access to the graph. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries and (3) pair queries. Let n denote the number of vertices in the graph, m the number of edges, and C_k the number of k-cliques. We design an algorithm that outputs a (1+ε)-approximation (with high probability) for C_k, whose expected query complexity and running time are O(n/C_k^{1/k} + m^{k/2}/C_k) · poly(log n, 1/ε, k). Hence, the complexity of the algorithm is sublinear in the size of the graph for C_k = ω(m^{k/2−1}). Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on log n, 1/ε and k). The previous results in this vein are by Feige (SICOMP 06) and by Goldreich and Ron (RSA 08) for edge counting (k=2) and by Eden et al. (FOCS 2015) for triangle counting (k=3). Our result matches the complexities of these results. The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting and does not generalize to larger cliques. We obtain a general algorithm that works for any k ≥ 3 by designing a procedure that samples each k-clique incident to a given set S of vertices with approximately equal probability. The primary difficulty is in finding cliques incident to purely high-degree vertices, since random sampling within neighbors has a low success probability. This is achieved by an algorithm that samples uniformly random high-degree vertices and a careful tradeoff between estimating cliques incident purely to high-degree vertices and those that include a low-degree vertex.
{"title":"On approximating the number of k-cliques in sublinear time","authors":"T. Eden, D. Ron, C. Seshadhri","doi":"10.1145/3188745.3188810","DOIUrl":"https://doi.org/10.1145/3188745.3188810","url":null,"abstract":"We study the problem of approximating the number of k-cliques in a graph when given query access to the graph. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries and (3) pair queries. Let n denote the number of vertices in the graph, m the number of edges, and Ck the number of k-cliques. We design an algorithm that outputs a (1+ε)-approximation (with high probability) for Ck, whose expected query complexity and running time are O(n/Ck1/k+mk/2/Ck )(logn, 1/ε,k). Hence, the complexity of the algorithm is sublinear in the size of the graph for Ck = ω(mk/2−1). Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on logn, 1/ε and k). The previous results in this vein are by Feige (SICOMP 06) and by Goldreich and Ron (RSA 08) for edge counting (k=2) and by Eden et al. (FOCS 2015) for triangle counting (k=3). Our result matches the complexities of these results. The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting, and does not generalize for larger cliques. We obtain a general algorithm that works for any k≥ 3 by designing a procedure that samples each k-clique incident to a given set S of vertices with approximately equal probability. The primary difficulty is in finding cliques incident to purely high-degree vertices, since random sampling within neighbors has a low success probability. This is achieved by an algorithm that samples uniform random high degree vertices and a careful tradeoff between estimating cliques incident purely to high-degree vertices and those that include a low-degree vertex.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"100 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80376908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we introduce a general framework for fine-grained reductions of approximate counting problems to their decision versions. (Thus we use an oracle that decides whether any witness exists to multiplicatively approximate the number of witnesses with minimal overhead.) This mirrors a foundational result of Sipser (STOC 1983) and Stockmeyer (SICOMP 1985) in the polynomial-time setting, and a similar result of Müller (IWPEC 2006) in the FPT setting. Using our framework, we obtain such reductions for some of the most important problems in fine-grained complexity: the Orthogonal Vectors problem, 3SUM, and the Negative-Weight Triangle problem (which is closely related to All-Pairs Shortest Path). While all these problems have simple algorithms over which it is conjectured that no polynomial improvement is possible, our reductions would remain interesting even if these conjectures were proved; they have only polylogarithmic overhead, and can therefore be applied to subpolynomial improvements such as the n^3/exp(Θ(√(log n)))-time algorithm for the Negative-Weight Triangle problem due to Williams (STOC 2014). Our framework is also general enough to apply to versions of the problems for which more efficient algorithms are known. For example, the Orthogonal Vectors problem over GF(m)^d for constant m can be solved in time n·poly(d) by a result of Williams and Yu (SODA 2014); our result implies that we can approximately count the number of orthogonal pairs with essentially the same running time. We also provide a fine-grained reduction from approximate #SAT to SAT. Suppose the Strong Exponential Time Hypothesis (SETH) is false, so that for some 1 < c < 2 and all k there is an O(c^n)-time algorithm for #k-SAT. Then we prove that for all k, there is an O((c+o(1))^n)-time algorithm for approximate #k-SAT. In particular, our result implies that the Exponential Time Hypothesis (ETH) is equivalent to the seemingly-weaker statement that there is no algorithm to approximate #3-SAT to within a factor of 1+ε in time 2^{o(n)}/ε^2 (taking ε > 0 as part of the input). A full version of this paper containing detailed proofs is available at https://arxiv.org/abs/1707.04609.
{"title":"Fine-grained reductions from approximate counting to decision","authors":"Holger Dell, John Lapinskas","doi":"10.1145/3188745.3188920","DOIUrl":"https://doi.org/10.1145/3188745.3188920","url":null,"abstract":"In this paper, we introduce a general framework for fine-grained reductions of approximate counting problems to their decision versions. (Thus we use an oracle that decides whether any witness exists to multiplicatively approximate the number of witnesses with minimal overhead.) This mirrors a foundational result of Sipser (STOC 1983) and Stockmeyer (SICOMP 1985) in the polynomial-time setting, and a similar result of Müller (IWPEC 2006) in the FPT setting. Using our framework, we obtain such reductions for some of the most important problems in fine-grained complexity: the Orthogonal Vectors problem, 3SUM, and the Negative-Weight Triangle problem (which is closely related to All-Pairs Shortest Path). While all these problems have simple algorithms over which it is conjectured that no polynomial improvement is possible, our reductions would remain interesting even if these conjectures were proved; they have only polylogarithmic overhead, and can therefore be applied to subpolynomial improvements such as the n3/exp(Θ(√logn))-time algorithm for the Negative-Weight Triangle problem due to Williams (STOC 2014). Our framework is also general enough to apply to versions of the problems for which more efficient algorithms are known. For example, the Orthogonal Vectors problem over GF(m)d for constant m can be solved in time n·poly(d) by a result of Williams and Yu (SODA 2014); our result implies that we can approximately count the number of orthogonal pairs with essentially the same running time. We also provide a fine-grained reduction from approximate #SAT to SAT. Suppose the Strong Exponential Time Hypothesis (SETH) is false, so that for some 1<c<2 and all k there is an O(cn)-time algorithm for #k-SAT. Then we prove that for all k, there is an O((c+o(1))n)-time algorithm for approximate #k-SAT. In particular, our result implies that the Exponential Time Hypothesis (ETH) is equivalent to the seemingly-weaker statement that there is no algorithm to approximate #3-SAT to within a factor of 1+ε in time 2o(n)/ε2 (taking ε > 0 as part of the input). A full version of this paper containing detailed proofs is available at https://arxiv.org/abs/1707.04609.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88518714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
For over a decade now we have been witnessing the success of massive parallel computation (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or Spark. One of the reasons for their success is the fact that these frameworks are able to accurately capture the nature of large-scale computation. In particular, compared to the classic distributed algorithms or PRAM models, these frameworks allow for much more local computation. The fundamental question that arises in this context, though, is: can we leverage this additional power to obtain even faster parallel algorithms? A prominent example here is the maximum matching problem—one of the most classic graph problems. It is well known that in the PRAM model one can compute a 2-approximate maximum matching in O(log n) rounds. However, the exact complexity of this problem in the MPC framework is still far from understood. Lattanzi et al. (SPAA 2011) showed that if each machine has n^{1+Ω(1)} memory, this problem can also be solved 2-approximately in a constant number of rounds. These techniques, as well as the approaches developed in the follow-up work, seem to get stuck in a fundamental way at roughly O(log n) rounds once we enter the (at most) near-linear memory regime. It is thus entirely possible that in this regime, which captures in particular the case of sparse graph computations, the best MPC round complexity matches what one can already get in the PRAM model, without the need to take advantage of the extra local computation power. In this paper, we finally refute that possibility. That is, we break the above O(log n) round complexity bound even in the case of slightly sublinear memory per machine. In fact, our improvement here is almost exponential: we are able to deliver a (2+ε)-approximate maximum matching, for any fixed constant ε > 0, in O((log log n)^2) rounds. To establish our result we need to deviate from the previous work in two important ways that are crucial for exploiting the power of the MPC model, as compared to the PRAM model. Firstly, we use vertex-based graph partitioning, instead of the edge-based approaches that were utilized so far. Secondly, we develop a technique of round compression. This technique enables one to take a (distributed) algorithm that computes an O(1)-approximation of maximum matching in O(log n) independent PRAM phases and implement a super-constant number of these phases in only a constant number of MPC rounds.
{"title":"Round compression for parallel matching algorithms","authors":"A. Czumaj, Jakub Lacki, A. Madry, Slobodan Mitrovic, Krzysztof Onak, P. Sankowski","doi":"10.1145/3188745.3188764","DOIUrl":"https://doi.org/10.1145/3188745.3188764","url":null,"abstract":"For over a decade now we have been witnessing the success of massive parallel computation (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or Spark. One of the reasons for their success is the fact that these frameworks are able to accurately capture the nature of large-scale computation. In particular, compared to the classic distributed algorithms or PRAM models, these frameworks allow for much more local computation. The fundamental question that arises in this context is though: can we leverage this additional power to obtain even faster parallel algorithms? A prominent example here is the maximum matching problem—one of the most classic graph problems. It is well known that in the PRAM model one can compute a 2-approximate maximum matching in O(logn) rounds. However, the exact complexity of this problem in the MPC framework is still far from understood. Lattanzi et al. (SPAA 2011) showed that if each machine has n1+Ω(1) memory, this problem can also be solved 2-approximately in a constant number of rounds. These techniques, as well as the approaches developed in the follow up work, seem though to get stuck in a fundamental way at roughly O(logn) rounds once we enter the (at most) near-linear memory regime. It is thus entirely possible that in this regime, which captures in particular the case of sparse graph computations, the best MPC round complexity matches what one can already get in the PRAM model, without the need to take advantage of the extra local computation power. In this paper, we finally refute that possibility. That is, we break the above O(logn) round complexity bound even in the case of slightly sublinear memory per machine. In fact, our improvement here is almost exponential: we are able to deliver a (2+є)-approximate maximum matching, for any fixed constant є>0, in O((loglogn)2) rounds. To establish our result we need to deviate from the previous work in two important ways that are crucial for exploiting the power of the MPC model, as compared to the PRAM model. Firstly, we use vertex–based graph partitioning, instead of the edge–based approaches that were utilized so far. Secondly, we develop a technique of round compression. This technique enables one to take a (distributed) algorithm that computes an O(1)-approximation of maximum matching in O(logn) independent PRAM phases and implement a super-constant number of these phases in only a constant number of MPC rounds.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"90 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80398881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study the efficient learnability of geometric concept classes — specifically, low-degree polynomial threshold functions (PTFs) and intersections of halfspaces — when a fraction of the training data is adversarially corrupted. We give the first polynomial-time PAC learning algorithms for these concept classes with dimension-independent error guarantees in the presence of nasty noise under the Gaussian distribution. In the nasty noise model, an omniscient adversary can arbitrarily corrupt a small fraction of both the unlabeled data points and their labels. This model generalizes well-studied noise models, including the malicious noise model and the agnostic (adversarial label noise) model. Prior to our work, the only concept class for which efficient malicious learning algorithms were known was the class of origin-centered halfspaces. At the core of our results is an efficient algorithm to approximate the low-degree Chow parameters of any bounded function in the presence of nasty noise. Our robust approximation algorithm for the Chow parameters provides near-optimal error guarantees for a range of distribution families satisfying mild concentration bounds and moment conditions. At the technical level, this algorithm employs an iterative “spectral” technique for outlier detection and removal inspired by recent work in robust unsupervised learning, which makes essential use of low-degree multivariate polynomials. Our robust learning algorithm for low-degree PTFs provides dimension-independent error guarantees for a class of tame distributions, including Gaussians and, more generally, any log-concave distribution with (approximately) known low-degree moments. For linear threshold functions (LTFs) under the Gaussian distribution, using a refinement of the localization technique, we give a polynomial-time algorithm that achieves a near-optimal error of O(ε), where ε is the noise rate. Our robust learning algorithm for intersections of halfspaces proceeds by projecting down to an appropriate low-dimensional subspace. Its correctness makes essential use of a novel robust inverse independence lemma that is of independent interest.
{"title":"Learning geometric concepts with nasty noise","authors":"Ilias Diakonikolas, D. Kane, Alistair Stewart","doi":"10.1145/3188745.3188754","DOIUrl":"https://doi.org/10.1145/3188745.3188754","url":null,"abstract":"We study the efficient learnability of geometric concept classes — specifically, low-degree polynomial threshold functions (PTFs) and intersections of halfspaces — when a fraction of the training data is adversarially corrupted. We give the first polynomial-time PAC learning algorithms for these concept classes with dimension-independent error guarantees in the presence of nasty noise under the Gaussian distribution. In the nasty noise model, an omniscient adversary can arbitrarily corrupt a small fraction of both the unlabeled data points and their labels. This model generalizes well-studied noise models, including the malicious noise model and the agnostic (adversarial label noise) model. Prior to our work, the only concept class for which efficient malicious learning algorithms were known was the class of origin-centered halfspaces. At the core of our results is an efficient algorithm to approximate the low-degree Chow-parameters of any bounded function in the presence of nasty noise. Our robust approximation algorithm for the Chow parameters provides near-optimal error guarantees for a range of distribution families satisfying mild concentration bounds and moment conditions. At the technical level, this algorithm employs an iterative “spectral” technique for outlier detection and removal inspired by recent work in robust unsupervised learning, which makes essential use of low-degree multivariate polynomials. Our robust learning algorithm for low-degree PTFs provides dimension-independent error guarantees for a class of tame distributions, including Gaussians and, more generally, any logconcave distribution with (approximately) known low-degree moments. For LTFs under the Gaussian distribution, using a refinement of the localization technique, we give a polynomial-time algorithm that achieves a near-optimal error of O(є), where є is the noise rate. Our robust learning algorithm for intersections of halfspaces proceeds by projecting down to an appropriate low-dimensional subspace. Its correctness makes essential use of a novel robust inverse independence lemma that is of independent interest.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"317 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80119317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the school choice market, where scarce public school seats are assigned to students, a key issue is how to reassign seats that are vacated after an initial round of centralized assignment. Every year around 10% of students assigned a seat in the NYC public high school system eventually do not use it, and their vacated seats can be reassigned. Practical solutions to the reassignment problem must be simple to implement, truthful and efficient. I propose and axiomatically justify a class of reassignment mechanisms, the Permuted Lottery Deferred Acceptance (PLDA) mechanisms, which generalize the commonly used Deferred Acceptance (DA) school choice mechanism to a two-round setting and retain its desirable incentive and efficiency properties. I also provide guidance to school districts as to how to choose the appropriate mechanism in this class for their setting. Centralized admissions are typically conducted in a single round using Deferred Acceptance, with a lottery used to break ties in each school’s prioritization of students. Our proposed PLDA mechanisms reassign vacated seats using a second round of DA with a lottery based on a suitable permutation of the first-round lottery numbers. I demonstrate that under a natural order condition on aggregate student demand for schools, the second-round tie-breaking lottery can be correlated arbitrarily with that of the first round without affecting allocative welfare. I also show how the identifying characteristic of PLDA mechanisms, their permutation, can be chosen to control reallocation. In practice, seats vacated after the initial round are reassigned using decentralized waitlists that create significant student movement after the start of the school year, which is costly for both students and schools. I show that reversing the lottery order between rounds minimizes reassignment among all PLDA mechanisms, allowing us to alleviate costly student movement between schools without affecting the efficiency of the final allocation. In a setting without school priorities, I also characterize PLDA mechanisms as the class of mechanisms that provide students with a guarantee at their first-round assignment, respect school priorities, and are strategy-proof, constrained Pareto efficient, and satisfy some mild symmetry properties. Finally, I provide simulations of the performance of different PLDA mechanisms in the presence of school priorities. All simulated PLDAs have similar allocative efficiency, while the PLDA based on reversing the tie-breaking lottery between rounds minimizes the number of reassigned students. These results support our theoretical findings. This is based on joint work with Itai Feigenbaum, Yash Kanoria, and Jay Sethuraman.
{"title":"Dynamic matching in school choice: efficient seat reassignment after late cancellations (invited talk)","authors":"Irene Lo","doi":"10.2139/ssrn.2993375","DOIUrl":"https://doi.org/10.2139/ssrn.2993375","url":null,"abstract":"In the school choice market, where scarce public school seats are assigned to students, a key issue is how to reassign seats that are vacated after an initial round of centralized assignment. Every year around 10% of students assigned a seat in the NYC public high school system eventually do not use it, and their vacated seats can be reassigned. Practical solutions to the reassignment problem must be simple to implement, truthful and efficient. I propose and axiomatically justify a class of reassignment mechanisms, the Per- muted Lottery Deferred Acceptance (PLDA) mechanisms, which generalize the commonly used Deferred Acceptance (DA) school choice mechanism to a two-round setting and retain its desirable in- centive and efficiency properties. I also provide guidance to school districts as to how to choose the appropriate mechanism in this class for their setting. Centralized admissions are typically conducted in a single round using Deferred Acceptance, with a lottery used to break ties in each school’s prioritization of students. Our proposed PLDA mechanisms reassign vacated seats using a second round of DA with a lottery based on a suitable permutation of the first-round lottery numbers. I demonstrate that under a natural order condition on aggregate student demand for schools, the second-round tie-breaking lottery can be correlated arbitrarily with that of the first round without affecting allocative welfare. I also show how the identifying char- acteristic of PLDA mechanisms, their permutation, can be chosen to control reallocation. vacated after the initial round are reassigned using decentralized waitlists that create significant student movement after the start of the school year, which is costly for both students and schools. I show that reversing the lottery order between rounds minimizes reassignment among all PLDA mechanisms, allowing us to alleviate costly student movement between schools without affecting the ef- ficiency of the final allocation. In a setting without school priorities, I also characterize PLDA mechanisms as the class of mechanisms that provide students with a guarantee at their first-round assign- ment, respect school priorities, and are strategy-proof, constrained Pareto efficient, and satisfy some mild symmetry properties. Finally, I provide simulations of the performance of different PLDA mecha- nisms in the presence of school priorities. All simulated PLDAs have similar allocative efficiency, while the PLDA based on reversing the tie-breaking lottery between rounds minimizes the number of reassigned students. These results support our theoretical findings. 
This is based on joint work with Itai Feigenbaum, Yash Kanoria, and Jay Sethuraman.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"37 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88865165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
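The first-round mechanism described above, student-proposing Deferred Acceptance with lottery tie-breaking, can be sketched directly; a PLDA would rerun it on vacated seats with a permuted lottery. The data layout below (preference lists, priority classes, lottery numbers) is our illustrative assumption.

```python
def deferred_acceptance(prefs, capacity, priority, lottery):
    """Student-proposing DA with lottery tie-breaking. prefs[s] is s's
    ranked school list; priority[c][s] is school c's priority class for s
    (lower = better); ties within a class are broken by lottery number.
    A PLDA (sketch only) would rerun this with a permuted lottery."""
    next_choice = {s: 0 for s in prefs}
    held = {c: [] for c in capacity}
    free = [s for s in prefs if prefs[s]]
    while free:
        s = free.pop()
        if next_choice[s] >= len(prefs[s]):
            continue                        # s has exhausted their list
        c = prefs[s][next_choice[s]]
        next_choice[s] += 1
        held[c].append(s)
        held[c].sort(key=lambda t: (priority[c][t], lottery[t]))
        if len(held[c]) > capacity[c]:
            free.append(held[c].pop())      # reject the lowest-ranked holder
    return {c: list(held[c]) for c in held}

prefs = {"s1": ["A", "B"], "s2": ["A", "B"], "s3": ["A"]}
capacity = {"A": 1, "B": 1}
priority = {"A": {"s1": 0, "s2": 0, "s3": 0}, "B": {"s1": 0, "s2": 0, "s3": 0}}
lottery = {"s1": 0.7, "s2": 0.2, "s3": 0.9}
print(deferred_acceptance(prefs, capacity, priority, lottery))
# s2 wins A on lottery; s1 falls to B; s3 ends unassigned.
```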
We consider a population of n agents which communicate with each other in a decentralized manner, through random pairwise interactions. One or more agents in the population may act as authoritative sources of information, and the objective of the remaining agents is to obtain information from or about these source agents. We study two basic tasks: broadcasting, in which the agents are to learn the bit-state of an authoritative source which is present in the population, and source detection, in which the agents are required to decide whether at least one source agent is present in the population. We focus on designing protocols which meet two natural conditions: (1) universality, i.e., independence of population size, and (2) rapid convergence to a correct global state after a reconfiguration, such as a change in the state of a source agent. Our main positive result is to show that both of these constraints can be met. For both the broadcasting problem and the source detection problem, we obtain solutions with an expected convergence time of O(log n), from any starting configuration. The solution to broadcasting is exact, which means that all agents reach the state broadcast by the source, while the solution to source detection admits one-sided error on an ε-fraction of the population (which is unavoidable for this problem). Both protocols are easy to implement in practice and are self-stabilizing, in the sense that the stated bounds on convergence time hold starting from any possible initial configuration of the system. Our protocols exploit the properties of self-organizing oscillatory dynamics. On the hardness side, our main structural insight is to prove that any protocol which meets the constraints of universality and of rapid convergence after reconfiguration must display a form of non-stationary behavior (of which oscillatory dynamics are an example). We also observe that the periodicity of the oscillatory behavior of the protocol, when present, must necessarily depend on the number #X of source agents present in the population. For instance, our protocols inherently rely on the emergence of a signal passing through the population, whose period is Θ(log(n/#X)) rounds for most starting configurations. The design of phase clocks with tunable frequency may be of independent interest, notably in modeling biological networks.
{"title":"Universal protocols for information dissemination using emergent signals","authors":"Bartłomiej Dudek, A. Kosowski","doi":"10.1145/3188745.3188818","DOIUrl":"https://doi.org/10.1145/3188745.3188818","url":null,"abstract":"We consider a population of n agents which communicate with each other in a decentralized manner, through random pairwise interactions. One or more agents in the population may act as authoritative sources of information, and the objective of the remaining agents is to obtain information from or about these source agents. We study two basic tasks: broadcasting, in which the agents are to learn the bit-state of an authoritative source which is present in the population, and source detection, in which the agents are required to decide if at least one source agent is present in the population or not. We focus on designing protocols which meet two natural conditions: (1) universality, i.e., independence of population size, and (2) rapid convergence to a correct global state after a reconfiguration, such as a change in the state of a source agent. Our main positive result is to show that both of these constraints can be met. For both the broadcasting problem and the source detection problem, we obtain solutions with an expected convergence time of O(logn), from any starting configuration. The solution to broadcasting is exact, which means that all agents reach the state broadcast by the source, while the solution to source detection admits one-sided error on a ε-fraction of the population (which is unavoidable for this problem). Both protocols are easy to implement in practice and are self-stabilizing, in the sense that the stated bounds on convergence time hold starting from any possible initial configuration of the system. Our protocols exploit the properties of self-organizing oscillatory dynamics. On the hardness side, our main structural insight is to prove that any protocol which meets the constraints of universality and of rapid convergence after reconfiguration must display a form of non-stationary behavior (of which oscillatory dynamics are an example). We also observe that the periodicity of the oscillatory behavior of the protocol, when present, must necessarily depend on the number #X of source agents present in the population. For instance, our protocols inherently rely on the emergence of a signal passing through the population, whose period is Θ(log(n/#X)) rounds for most starting configurations. The design of phase clocks with tunable frequency may be of independent interest, notably in modeling biological networks.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"55 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82037867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The individualization-refinement paradigm provides a strong toolbox for testing isomorphism of two graphs, and indeed, the currently fastest implementations of isomorphism solvers all follow this approach. While these solvers are fast in practice, from a theoretical point of view no general lower bounds on the worst-case complexity of these tools are known. In fact, it is an open question what the running time of individualization-refinement algorithms is; for all we know, some of the algorithms could have polynomial running time. In this work we give a negative answer to this question and construct a family of graphs on which algorithms based on the individualization-refinement paradigm require exponential time. Contrary to a previous construction of Miyazaki, which only applies to a specific implementation within the individualization-refinement framework, our construction is immune to changing the cell selector, the refinement operator, the invariant that is used, or adding various heuristic invariants to the algorithm. In fact, our graphs also provide exponential lower bounds in the case when the k-dimensional Weisfeiler-Leman algorithm is used to replace the 1-dimensional Weisfeiler-Leman algorithm (often called color refinement) that is normally used. Finally, the arguments even work when the entire automorphism group of the inputs is initially provided to the algorithm. The arguments apply to isomorphism testing algorithms as well as canonization algorithms within the framework.
{"title":"An exponential lower bound for individualization-refinement algorithms for graph isomorphism","authors":"Daniel Neuen, Pascal Schweitzer","doi":"10.1145/3188745.3188900","DOIUrl":"https://doi.org/10.1145/3188745.3188900","url":null,"abstract":"The individualization-refinement paradigm provides a strong toolbox for testing isomorphism of two graphs and indeed, the currently fastest implementations of isomorphism solvers all follow this approach. While these solvers are fast in practice, from a theoretical point of view, no general lower bounds concerning the worst case complexity of these tools are known. In fact, it is an open question what the running time of individualization-refinement algorithms is. For all we know some of the algorithms could have polynomial running time. In this work we give a negative answer to this question and construct a family of graphs on which algorithms based on the individualization-refinement paradigm require exponential time. Contrary to a previous construction of Miyazaki, that only applies to a specific implementation within the individualization-refinement framework, our construction is immune to changing the cell selector, the refinement operator, the invariant that is used, or adding various heuristic invariants to the algorithm. In fact, our graphs also provide exponential lower bounds in the case when the k-dimensional Weisfeiler-Leman algorithm is used to replace the the 1-dimensional Weisfeiler-Leman algorithm (often called color refinement) that is normally used. Finally, the arguments even work when the entire automorphism group of the inputs is initially provided to the algorithm. The arguments apply to isomorphism testing algorithms as well as canonization algorithms within the framework.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"69 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83892992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We construct near-optimal linear decision trees for a variety of decision problems in combinatorics and discrete geometry. For example, for any constant k, we construct linear decision trees that solve the k-SUM problem on n elements using O(n log^2 n) linear queries. Moreover, the queries we use are comparison queries, which compare the sums of two k-subsets; when viewed as linear queries, comparison queries are 2k-sparse and have only {−1,0,1} coefficients. We give similar constructions for sorting sumsets A+B and for solving the SUBSET-SUM problem, both with optimal number of queries, up to poly-logarithmic terms. Our constructions are based on the notion of “inference dimension”, recently introduced by the authors in the context of active classification with comparison queries. This can be viewed as another contribution to the fruitful link between machine learning and discrete geometry, which goes back to the discovery of the VC dimension.
{"title":"Near-optimal linear decision trees for k-SUM and related problems","authors":"D. Kane, Shachar Lovett, S. Moran","doi":"10.1145/3188745.3188770","DOIUrl":"https://doi.org/10.1145/3188745.3188770","url":null,"abstract":"We construct near optimal linear decision trees for a variety of decision problems in combinatorics and discrete geometry. For example, for any constant k, we construct linear decision trees that solve the k-SUM problem on n elements using O(n log2 n) linear queries. Moreover, the queries we use are comparison queries, which compare the sums of two k-subsets; when viewed as linear queries, comparison queries are 2k-sparse and have only {−1,0,1} coefficients. We give similar constructions for sorting sumsets A+B and for solving the SUBSET-SUM problem, both with optimal number of queries, up to poly-logarithmic terms. Our constructions are based on the notion of “inference dimension”, recently introduced by the authors in the context of active classification with comparison queries. This can be viewed as another contribution to the fruitful link between machine learning and discrete geometry, which goes back to the discovery of the VC dimension.","PeriodicalId":20593,"journal":{"name":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","volume":"70 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75587361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}