We give an algorithm for learning a mixture of unstructured distributions. This problem arises in various unsupervised learning scenarios, for example in learning topic models from a corpus of documents spanning several topics. We show how to learn the constituents of a mixture of k arbitrary distributions over a large discrete domain [n] = {1, 2, ..., n}, together with the mixture weights, using O(n polylog n) samples. (In the topic-model learning setting, the mixture constituents correspond to the topic distributions.) This task is information-theoretically impossible for k > 1 under the usual sampling process from a mixture distribution. However, there are situations (such as the above-mentioned topic model case) in which each sample point consists of several observations from the same mixture constituent. This number of observations, which we call the "sampling aperture", is a crucial parameter of the problem. We obtain the first bounds for this mixture-learning problem without imposing any assumptions on the mixture constituents. We show that efficient learning is possible exactly at the information-theoretically least possible aperture of 2k-1. Thus, we achieve near-optimal dependence on n and optimal aperture. While the sample size required by our algorithm depends exponentially on k, we prove that such a dependence is unavoidable when one considers general mixtures. Several tools contribute to the algorithm, including concentration results for random matrices, dimension reduction, moment estimation, and sensitivity analysis.
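The role of the sampling aperture can be illustrated with a small sketch. This is not the paper's algorithm; the function names and the two toy mixtures are hypothetical, chosen only to show that two different mixtures can induce the same law on a single observation while differing once a sample point carries two observations from the same constituent.

```python
import random

def draw_sample_point(weights, constituents, aperture, rng):
    """Pick one constituent by mixture weight, then draw `aperture`
    i.i.d. observations from that same constituent over [n] = {1, ..., n}."""
    j = rng.choices(range(len(weights)), weights=weights)[0]
    n = len(constituents[j])
    return rng.choices(range(1, n + 1), weights=constituents[j], k=aperture)

def single_observation_law(weights, constituents):
    """Distribution of one observation (aperture 1): the weighted average
    of the constituents."""
    n = len(constituents[0])
    return [sum(w * p[i] for w, p in zip(weights, constituents)) for i in range(n)]

def agreement_probability(weights, constituents):
    """Probability that the two observations in an aperture-2 sample point
    coincide: sum_j w_j * sum_i p_j(i)^2."""
    return sum(w * sum(q * q for q in p) for w, p in zip(weights, constituents))

# Two toy mixtures (made up for illustration) with k = 2 and n = 2.
mixture_a = ([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])  # two point masses
mixture_b = ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])  # two uniform constituents

# Same law at aperture 1, different statistics at aperture 2.
print(single_observation_law(*mixture_a), single_observation_law(*mixture_b))
print(agreement_probability(*mixture_a), agreement_probability(*mixture_b))

# One sample point at aperture 2k - 1 = 3 from mixture_a.
print(draw_sample_point(*mixture_a, aperture=3, rng=random.Random(0)))
```

This toy pair happens to be separated already at aperture 2; the abstract's threshold of 2k-1 concerns arbitrary mixtures of k constituents.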
{"title":"Learning mixtures of arbitrary distributions over large discrete domains","authors":"Y. Rabani, L. Schulman, Chaitanya Swamy","doi":"10.1145/2554797.2554818","DOIUrl":"https://doi.org/10.1145/2554797.2554818","journal":{"name":"Proceedings of the 5th conference on Innovations in theoretical computer science"},"publicationDate":"2012-12-06"}
Michael Kearns, Mallesh M. Pai, Aaron Roth, Jonathan Ullman
We study the problem of implementing equilibria of complete information games in settings of incomplete information, and address this problem using "recommender mechanisms." A recommender mechanism is one that does not have the power to enforce outcomes or to force participation; rather, it only has the power to suggest outcomes on the basis of voluntary participation. We show that despite these restrictions, recommender mechanisms can implement equilibria of complete information games in settings of incomplete information under the condition that the game is large, i.e., that there are a large number of players and any player's action affects any other's payoff by at most a small amount. Our result follows from a novel application of differential privacy. We show that any algorithm that computes a correlated equilibrium of a complete information game while satisfying a variant of differential privacy, which we call joint differential privacy, can be used as a recommender mechanism while satisfying our desired incentive properties. Our main technical result is an algorithm for computing a correlated equilibrium of a large game while satisfying joint differential privacy. Although our recommender mechanisms are designed to satisfy game-theoretic properties, our solution ends up satisfying a strong privacy property as well. No group of players can learn "much" about the type of any player outside the group from the recommendations of the mechanism, even if these players collude in an arbitrary way. As such, our algorithm is able to implement equilibria of complete information games without revealing information about the realized types.
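For readers who want the privacy notion in symbols, joint differential privacy is commonly stated as below. The notation is an assumption, since the abstract fixes none: $\mathcal{M}$ maps a profile of $n$ types to a profile of $n$ recommendations, $t_{-i}$ denotes the types of all players other than $i$, and $\varepsilon, \delta$ are the privacy parameters.

```latex
% (\varepsilon,\delta)-joint differential privacy for
% \mathcal{M} : \mathcal{T}^n \to \mathcal{A}^n.
% For every player i, every pair of types t_i, t_i' \in \mathcal{T},
% every profile t_{-i} \in \mathcal{T}^{n-1} of the other players' types,
% and every set S \subseteq \mathcal{A}^{n-1}:
\[
  \Pr\bigl[\mathcal{M}(t_i, t_{-i})_{-i} \in S\bigr]
  \;\le\;
  e^{\varepsilon}\,\Pr\bigl[\mathcal{M}(t_i', t_{-i})_{-i} \in S\bigr] + \delta .
\]
```

The condition constrains only the recommendations sent to the other players, so player $i$'s own recommendation may depend arbitrarily on player $i$'s type.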
{"title":"Mechanism design in large games: incentives and privacy","authors":"Michael Kearns, Mallesh M. Pai, Aaron Roth, Jonathan Ullman","doi":"10.1145/2554797.2554834","DOIUrl":"https://doi.org/10.1145/2554797.2554834","journal":{"name":"Proceedings of the 5th conference on Innovations in theoretical computer science"},"publicationDate":"2012-07-17"}
The seminal result of Impagliazzo and Rudich (STOC 1989) gave a black-box separation between one-way functions and public-key encryption: a public-key encryption scheme cannot be constructed using one-way functions in a black-box way. In addition, their result implied black-box separations between one-way functions and protocols for certain Secure Function Evaluation (SFE) functionalities (in particular, Oblivious Transfer). Surprisingly, however, since then there has been no further progress in separating one-way functions and SFE functionalities. In this work, we present the complete picture for finite deterministic 2-party SFE functionalities, vis-à-vis one-way functions. We show that in the case of semi-honest adversaries, one-way functions are black-box separated from all such SFE functionalities, except the ones which have unconditionally secure protocols (and hence do not rely on any computational hardness). In the case of active adversaries, a black-box one-way function is indeed useful for SFE, but we show that it is only as useful as access to an ideal commitment functionality. Technically, our main result establishes the limitations of random oracles for secure computation. We show that a two-party deterministic functionality f has a secure protocol in the random oracle model that is (statistically) secure against semi-honest adversaries if and only if f has a protocol in the plain model that is (perfectly) secure against semi-honest adversaries. Further, in the case of active adversaries, a deterministic SFE functionality f has a (UC or standalone) statistically secure protocol in the random oracle model if and only if f has a (UC or standalone) statistically secure protocol in the commitment-hybrid model. Our proof is based on a "frontier analysis" of two-party protocols, combining it with (extensions of) the "independence learners" of Impagliazzo-Rudich/Barak-Mahmoody. We make essential use of a combinatorial property, originally discovered by Kushilevitz (FOCS 1989), of functions that have semi-honest secure protocols in the plain model (and hence our analysis applies only to functions of polynomial-sized domains, for which such a characterization is known). Our result could be seen as a first step towards proving a conjecture that we put forth in this work and call the Many-Worlds Conjecture. For every 2-party SFE functionality f, one can consider a "world" where f can be semi-honest securely realized in the computational setting. The Many-Worlds Conjecture states that there are infinitely many "distinct worlds" between minicrypt and cryptomania in the universe of Impagliazzo's Worlds.
{"title":"Limits of random oracles in secure computation","authors":"Mohammad Mahmoody, H. K. Maji, M. Prabhakaran","doi":"10.1145/2554797.2554801","DOIUrl":"https://doi.org/10.1145/2554797.2554801","journal":{"name":"Proceedings of the 5th conference on Innovations in theoretical computer science"},"publicationDate":"2012-05-15"}
We study the Chance-Constrained Integer Feasibility Problem, where the goal is to determine whether the random polytope $P(A,b)=\{x \in \mathbb{R}^n : A_i x \le b_i,\ i \in [m]\}$, obtained by choosing the constraint matrix $A$ and vector $b$ from a known distribution, is integer feasible with probability at least $1-\varepsilon$. We consider the case when the entries of the constraint matrix $A$ are i.i.d. Gaussian (equivalently, i.i.d. from any spherically symmetric distribution). The radius of the largest inscribed ball is closely related to the existence of integer points in the polytope. We find that for $m$ up to $2^{O(\sqrt{n})}$ constraints (rows of $A$), there exist constants $c_0 < c_1$ such that with high probability ($\varepsilon = 1/\mathrm{poly}(n)$), random polytopes are integer feasible if the radius of the largest ball contained in the polytope is at least $c_1\sqrt{\log(m/n)}$, and integer infeasible if the largest ball contained in the polytope is centered at $(1/2,\ldots,1/2)$ and has radius at most $c_0\sqrt{\log(m/n)}$. Thus, random polytopes transition from having no integer points to being integer feasible within a constant factor increase in the radius of the largest inscribed ball. Integer feasibility is established via a randomized polynomial-time algorithm for finding an integer point in the polytope. Our main tool is a simple new connection between integer feasibility and linear discrepancy. We extend a recent algorithm for finding low-discrepancy solutions to give a constructive upper bound on the linear discrepancy of random Gaussian matrices. By our connection between discrepancy and integer feasibility, this upper bound on linear discrepancy translates to the radius bound that guarantees integer feasibility of random polytopes.
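The inscribed-ball radius that drives the threshold can be made concrete with a short sketch. It relies only on the standard fact that the largest ball centered at a point c inside {x : A_i x ≤ b_i} has radius min_i (b_i - A_i c)/||A_i||. The helper names and the way b is chosen (so that a ball of prescribed radius fits around c) are assumptions for illustration, not the paper's construction or algorithm.

```python
import math
import random

def inscribed_radius(A, b, center):
    """Radius of the largest ball centered at `center` inside
    {x : A_i . x <= b_i for all i}; negative if `center` violates a constraint."""
    return min(
        (b_i - sum(a * c for a, c in zip(row, center)))
        / math.sqrt(sum(a * a for a in row))
        for row, b_i in zip(A, b)
    )

def gaussian_polytope(m, n, center, radius, rng):
    """Hypothetical generator: m rows with i.i.d. standard Gaussian entries,
    and b_i set so that every facet is tangent to the ball of the given
    radius around `center`."""
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    b = [
        sum(a * c for a, c in zip(row, center))
        + radius * math.sqrt(sum(a * a for a in row))
        for row in A
    ]
    return A, b

# Sanity check on the unit square [0, 1]^2, centered at (1/2, 1/2).
square_A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
square_b = [1, 0, 1, 0]
print(inscribed_radius(square_A, square_b, [0.5, 0.5]))

# A random instance with n = 5, m = 20 and a ball of radius 2 around (1/2, ..., 1/2).
rng = random.Random(1)
center = [0.5] * 5
A, b = gaussian_polytope(20, 5, center, 2.0, rng)
print(inscribed_radius(A, b, center))
```

The sketch only evaluates the radius around a fixed center; finding the largest inscribed ball over all centers would require solving a linear program.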
{"title":"Integer feasibility of random polytopes: random integer programs","authors":"Karthekeyan Chandrasekaran, S. Vempala","doi":"10.1145/2554797.2554838","DOIUrl":"https://doi.org/10.1145/2554797.2554838","journal":{"name":"Proceedings of the 5th conference on Innovations in theoretical computer science"},"publicationDate":"2011-11-20"}