A Stackelberg game is played between a leader and a follower. The leader first chooses an action, then the follower plays his best response. The goal of the leader is to pick the action that will maximize his payoff given the follower’s best response. In this paper we present an approach to solving for the leader’s optimal strategy in certain Stackelberg games where the follower’s utility function (and thus the subsequent best response of the follower) is unknown. Stackelberg games capture, for example, the following interaction between a producer and a consumer. The producer chooses the prices of the goods he produces, and then a consumer chooses to buy a utility maximizing bundle of goods. The goal of the seller here is to set prices to maximize his profit—his revenue, minus the production cost of the purchased bundle. It is quite natural that the seller in this example should not know the buyer’s utility function. However, he does have access to revealed preference feedback---he can set prices, and then observe the purchased bundle and his own profit. We give algorithms for efficiently solving, in terms of both computational and query complexity, a broad class of Stackelberg games in which the follower’s utility function is unknown, using only “revealed preference” access to it. This class includes in particular the profit maximization problem, as well as the optimal tolling problem in nonatomic congestion games, when the latency functions are unknown. Surprisingly, we are able to solve these problems even though the optimization problems are non-convex in the leader’s actions.
{"title":"Watch and learn: optimizing from revealed preferences feedback","authors":"Aaron Roth, Jonathan Ullman, Zhiwei Steven Wu","doi":"10.1145/2897518.2897579","DOIUrl":"https://doi.org/10.1145/2897518.2897579","url":null,"abstract":"A Stackelberg game is played between a leader and a follower. The leader first chooses an action, then the follower plays his best response. The goal of the leader is to pick the action that will maximize his payoff given the follower’s best response. In this paper we present an approach to solving for the leader’s optimal strategy in certain Stackelberg games where the follower’s utility function (and thus the subsequent best response of the follower) is unknown. Stackelberg games capture, for example, the following interaction between a producer and a consumer. The producer chooses the prices of the goods he produces, and then a consumer chooses to buy a utility maximizing bundle of goods. The goal of the seller here is to set prices to maximize his profit—his revenue, minus the production cost of the purchased bundle. It is quite natural that the seller in this example should not know the buyer’s utility function. However, he does have access to revealed preference feedback---he can set prices, and then observe the purchased bundle and his own profit. We give algorithms for efficiently solving, in terms of both computational and query complexity, a broad class of Stackelberg games in which the follower’s utility function is unknown, using only “revealed preference” access to it. This class includes in particular the profit maximization problem, as well as the optimal tolling problem in nonatomic congestion games, when the latency functions are unknown. Surprisingly, we are able to solve these problems even though the optimization problems are non-convex in the leader’s actions.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129180384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Persuasion, defined as the act of exploiting an informational advantage in order to effect the decisions of others, is ubiquitous. Indeed, persuasive communication has been estimated to account for almost a third of all economic activity in the US. This paper examines persuasion through a computational lens, focusing on what is perhaps the most basic and fundamental model in this space: the celebrated Bayesian persuasion model of Kamenica and Gentzkow. Here there are two players, a sender and a receiver. The receiver must take one of a number of actions with a-priori unknown payoff, and the sender has access to additional information regarding the payoffs of the various actions for both players. The sender can commit to revealing a noisy signal regarding the realization of the payoffs of various actions, and would like to do so as to maximize her own payoff in expectation assuming that the receiver rationally acts to maximize his own payoff. When the payoffs of various actions follow a joint distribution (the common prior), the sender's problem is nontrivial, and its computational complexity depends on the representation of this prior. We examine the sender's optimization task in three of the most natural input models for this problem, and essentially pin down its computational complexity in each. When the payoff distributions of the different actions are i.i.d. and given explicitly, we exhibit a polynomial-time (exact) algorithmic solution, and a ``simple'' (1-1/e)-approximation algorithm. Our optimal scheme for the i.i.d. setting involves an analogy to auction theory, and makes use of Border's characterization of the space of reduced-forms for single-item auctions. When action payoffs are independent but non-identical with marginal distributions given explicitly, we show that it is #P-hard to compute the optimal expected sender utility. In doing so, we rule out a generalized Border's theorem, as defined by Gopalan et al, for this setting. Finally, we consider a general (possibly correlated) joint distribution of action payoffs presented by a black box sampling oracle, and exhibit a fully polynomial-time approximation scheme (FPTAS) with a bi-criteria guarantee. Our FPTAS is based on Monte-Carlo sampling, and its analysis relies on the principle of deferred decisions. Moreover, we show that this result is the best possible in the black-box model for information-theoretic reasons.
{"title":"Algorithmic Bayesian persuasion","authors":"S. Dughmi, Haifeng Xu","doi":"10.1145/2897518.2897583","DOIUrl":"https://doi.org/10.1145/2897518.2897583","url":null,"abstract":"Persuasion, defined as the act of exploiting an informational advantage in order to effect the decisions of others, is ubiquitous. Indeed, persuasive communication has been estimated to account for almost a third of all economic activity in the US. This paper examines persuasion through a computational lens, focusing on what is perhaps the most basic and fundamental model in this space: the celebrated Bayesian persuasion model of Kamenica and Gentzkow. Here there are two players, a sender and a receiver. The receiver must take one of a number of actions with a-priori unknown payoff, and the sender has access to additional information regarding the payoffs of the various actions for both players. The sender can commit to revealing a noisy signal regarding the realization of the payoffs of various actions, and would like to do so as to maximize her own payoff in expectation assuming that the receiver rationally acts to maximize his own payoff. When the payoffs of various actions follow a joint distribution (the common prior), the sender's problem is nontrivial, and its computational complexity depends on the representation of this prior. We examine the sender's optimization task in three of the most natural input models for this problem, and essentially pin down its computational complexity in each. When the payoff distributions of the different actions are i.i.d. and given explicitly, we exhibit a polynomial-time (exact) algorithmic solution, and a ``simple'' (1-1/e)-approximation algorithm. Our optimal scheme for the i.i.d. setting involves an analogy to auction theory, and makes use of Border's characterization of the space of reduced-forms for single-item auctions. When action payoffs are independent but non-identical with marginal distributions given explicitly, we show that it is #P-hard to compute the optimal expected sender utility. In doing so, we rule out a generalized Border's theorem, as defined by Gopalan et al, for this setting. Finally, we consider a general (possibly correlated) joint distribution of action payoffs presented by a black box sampling oracle, and exhibit a fully polynomial-time approximation scheme (FPTAS) with a bi-criteria guarantee. Our FPTAS is based on Monte-Carlo sampling, and its analysis relies on the principle of deferred decisions. Moreover, we show that this result is the best possible in the black-box model for information-theoretic reasons.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129441236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Feldman, Nicole Immorlica, Brendan Lucier, T. Roughgarden, Vasilis Syrgkanis
We present an analysis framework for bounding the price of anarchy (POA) in games that have many players, as in many of the games most pertinent to computer science applications. We use this framework to demonstrate that, in many of the models in which the POA has been studied, the POA in large games is much smaller than the worst-case bound. Our framework also differentiates between mechanisms with similar worst-case performance, such as simultaneous uniform-price auctions and greedy combinatorial auctions, thereby providing new insights about which mechanisms are likely to perform well in realistic settings.
{"title":"The price of anarchy in large games","authors":"M. Feldman, Nicole Immorlica, Brendan Lucier, T. Roughgarden, Vasilis Syrgkanis","doi":"10.1145/2897518.2897580","DOIUrl":"https://doi.org/10.1145/2897518.2897580","url":null,"abstract":"We present an analysis framework for bounding the price of anarchy (POA) in games that have many players, as in many of the games most pertinent to computer science applications. We use this framework to demonstrate that, in many of the models in which the POA has been studied, the POA in large games is much smaller than the worst-case bound. Our framework also differentiates between mechanisms with similar worst-case performance, such as simultaneous uniform-price auctions and greedy combinatorial auctions, thereby providing new insights about which mechanisms are likely to perform well in realistic settings.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127096706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider the following natural generalization of Binary Search: in a given undirected, positively weighted graph, one vertex is a target. The algorithm’s task is to identify the target by adaptively querying vertices. In response to querying a node q, the algorithm learns either that q is the target, or is given an edge out of q that lies on a shortest path from q to the target. We study this problem in a general noisy model in which each query independently receives a correct answer with probability p > 1/2 (a known constant), and an (adversarial) incorrect one with probability 1 − p. Our main positive result is that when p = 1 (i.e., all answers are correct), log2 n queries are always sufficient. For general p, we give an (almost information-theoretically optimal) algorithm that uses, in expectation, no more than (1 − δ) logn/1 − H(p) + o(logn) + O(log2 (1/δ)) queries, and identifies the target correctly with probability at leas 1 − δ. Here, H(p) = −(p logp + (1 − p) log(1 − p)) denotes the entropy. The first bound is achieved by the algorithm that iteratively queries a 1-median of the nodes not ruled out yet; the second bound by careful repeated invocations of a multiplicative weights algorithm. Even for p = 1, we show several hardness results for the problem of determining whether a target can be found using K queries. Our upper bound of log2 n implies a quasipolynomial-time algorithm for undirected connected graphs; we show that this is best-possible under the Strong Exponential Time Hypothesis (SETH). Furthermore, for directed graphs, or for undirected graphs with non-uniform node querying costs, the problem is PSPACE-complete. For a semi-adaptive version, in which one may query r nodes each in k rounds, we show membership in Σ2k−1 in the polynomial hierarchy, and hardness for Σ2k−5.
{"title":"Deterministic and probabilistic binary search in graphs","authors":"E. Emamjomeh-Zadeh, D. Kempe, Vikrant Singhal","doi":"10.1145/2897518.2897656","DOIUrl":"https://doi.org/10.1145/2897518.2897656","url":null,"abstract":"We consider the following natural generalization of Binary Search: in a given undirected, positively weighted graph, one vertex is a target. The algorithm’s task is to identify the target by adaptively querying vertices. In response to querying a node q, the algorithm learns either that q is the target, or is given an edge out of q that lies on a shortest path from q to the target. We study this problem in a general noisy model in which each query independently receives a correct answer with probability p > 1/2 (a known constant), and an (adversarial) incorrect one with probability 1 − p. Our main positive result is that when p = 1 (i.e., all answers are correct), log2 n queries are always sufficient. For general p, we give an (almost information-theoretically optimal) algorithm that uses, in expectation, no more than (1 − δ) logn/1 − H(p) + o(logn) + O(log2 (1/δ)) queries, and identifies the target correctly with probability at leas 1 − δ. Here, H(p) = −(p logp + (1 − p) log(1 − p)) denotes the entropy. The first bound is achieved by the algorithm that iteratively queries a 1-median of the nodes not ruled out yet; the second bound by careful repeated invocations of a multiplicative weights algorithm. Even for p = 1, we show several hardness results for the problem of determining whether a target can be found using K queries. Our upper bound of log2 n implies a quasipolynomial-time algorithm for undirected connected graphs; we show that this is best-possible under the Strong Exponential Time Hypothesis (SETH). Furthermore, for directed graphs, or for undirected graphs with non-uniform node querying costs, the problem is PSPACE-complete. For a semi-adaptive version, in which one may query r nodes each in k rounds, we show membership in Σ2k−1 in the polynomial hierarchy, and hardness for Σ2k−5.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114711095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}