We investigate properties of Thompson Sampling in the stochastic multi-armed bandit problem with delayed feedback. In a setting with i.i.d. delays, we establish what are, to our knowledge, the first regret bounds for Thompson Sampling with arbitrary delay distributions, including ones with unbounded expectation. Our bounds are qualitatively comparable to the best available bounds derived via ad hoc algorithms, and depend on the delays only through selected quantiles of the delay distributions. Furthermore, in extensive simulation experiments, we find that Thompson Sampling outperforms a number of alternative proposals, including methods specifically designed for settings with delayed feedback.
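The setting above can be sketched in a few lines: a minimal Bernoulli Thompson Sampling loop in which each reward arrives after a random delay and updates the Beta posterior only once it lands. The arm means, horizon, and delay distribution below are illustrative assumptions, not the paper's experimental setup.

```python
import random

def thompson_with_delays(means, horizon, delay_sampler, rng):
    k = len(means)
    alpha = [1] * k          # Beta posterior: successes + 1
    beta = [1] * k           # Beta posterior: failures + 1
    pending = []             # (arrival_time, arm, reward) not yet observed
    pulls = [0] * k
    for t in range(horizon):
        # Incorporate only the feedback whose delay has elapsed.
        arrived = [p for p in pending if p[0] <= t]
        pending = [p for p in pending if p[0] > t]
        for _, arm, r in arrived:
            if r:
                alpha[arm] += 1
            else:
                beta[arm] += 1
        # Standard Thompson step on the posterior built from arrived rewards.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        pulls[arm] += 1
        reward = 1 if rng.random() < means[arm] else 0
        pending.append((t + delay_sampler(rng), arm, reward))
    return pulls

rng = random.Random(0)
# Delays drawn uniformly from {1, ..., 49}; heavier-tailed delays also work.
pulls = thompson_with_delays([0.1, 0.9], 2000, lambda r: r.randrange(1, 50), rng)
```

Even with delays of up to 49 rounds, the posterior eventually concentrates and the better arm dominates the pull counts.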
"Thompson Sampling with Unrestricted Delays." Hang Wu, Stefan Wager. Proceedings of the 23rd ACM Conference on Economics and Computation. DOI: 10.1145/3490486.3538376. Published 2022-02-24.
Online platforms have a wealth of data, run countless experiments and use industrial-scale algorithms to optimize user experience. Despite this, many users seem to regret the time they spend on these platforms. One possible explanation is that incentives are misaligned: platforms are not optimizing for user happiness. We suggest the problem runs deeper, transcending the specific incentives of any particular platform, and instead stems from a mistaken foundational assumption. To understand what users want, platforms look at what users do. This is a kind of revealed-preference assumption that is ubiquitous in the way user models are built. Yet research has demonstrated, and personal experience affirms, that we often make choices in the moment that are inconsistent with what we actually want. The behavioral economics and psychology literatures suggest, for example, that we can choose mindlessly or that we can be too myopic in our choices, behaviors that feel entirely familiar on online platforms. In this work, we develop a model of media consumption where users have inconsistent preferences. We consider an altruistic platform which simply wants to maximize user utility, but only observes behavioral data in the form of the user's engagement. We show how our model of users' preference inconsistencies produces phenomena that are familiar from everyday experience, but difficult to capture in traditional user interaction models. These phenomena include users who have long sessions on a platform but derive very little utility from it, and platform changes that steadily raise user engagement before abruptly causing users to go "cold turkey" and quit. A key ingredient in our model is a formulation for how platforms determine what to show users: they optimize over a large set of potential content (the content manifold) parametrized by underlying features of the content. 
Whether improving engagement improves user welfare depends on the direction of movement in the content manifold: for certain directions of change, increasing engagement makes users less happy, while in other directions on the same manifold, increasing engagement makes users happier. We provide a characterization of the structure of content manifolds for which increasing engagement fails to increase user utility. By linking these effects to abstractions of platform design choices, our model thus creates a theoretical framework and vocabulary in which to explore interactions between design, behavioral science, and social media. A full version of this paper can be found at https://arxiv.org/pdf/2202.11776.pdf.
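The role of direction on the content manifold can be illustrated numerically. In this toy two-feature parametrization (the features and functional forms are made up for illustration and are not the paper's model), one direction of change raises engagement while lowering utility, and another raises both.

```python
def engagement(x, y):
    # Engagement is increasing in both hypothetical features.
    return 2.0 * x + 1.0 * y

def utility(x, y):
    # Utility is increasing in y but single-peaked in x (peak at x = 0.2).
    return y - (x - 0.2) ** 2

def directional_change(f, point, direction, eps=1e-4):
    # Finite-difference rate of change of f at `point` along `direction`.
    (x, y), (dx, dy) = point, direction
    return (f(x + eps * dx, y + eps * dy) - f(x, y)) / eps

p = (0.8, 0.5)  # current content, already past the utility peak in x
results = {d: (directional_change(engagement, p, d),
               directional_change(utility, p, d))
           for d in [(1, 0), (0, 1)]}
```

Along (1, 0) engagement rises while utility falls; along (0, 1) both rise, matching the claim that whether engagement optimization helps users depends on the direction of movement.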
"The Challenge of Understanding What Users Want: Inconsistent Preferences and Engagement Optimization." J. Kleinberg, S. Mullainathan, Manish Raghavan. Proceedings of the 23rd ACM Conference on Economics and Computation. DOI: 10.1145/3490486.3538365. Published 2022-02-23.
Jibang Wu, Zixuan Zhang, Zhe Feng, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan, Haifeng Xu
In today's economy, it has become important for Internet platforms to consider sequential information design problems in order to align their long-term interests with the incentives of gig service providers (e.g., drivers, hosts). This paper proposes a novel model of sequential information design, namely the Markov persuasion process (MPP), in which a sender with an informational advantage seeks to persuade a stream of myopic receivers to take actions that maximize the sender's cumulative utility in a finite-horizon Markovian environment with varying prior and utility functions. Planning in MPPs thus faces the unique challenge of finding a signaling policy that is simultaneously persuasive to the myopic receivers and induces the optimal long-term cumulative utility for the sender. Nevertheless, at the population level, where the model is known, it turns out that we can efficiently determine the optimal (resp. ε-optimal) policy with finite (resp. infinite) states and outcomes, through a modified formulation of the Bellman equation that additionally takes persuasiveness into consideration. Our main technical contribution is to study the MPP under the online reinforcement learning (RL) setting, where the goal is to learn the optimal signaling policy by interacting with the underlying MPP, without knowledge of the sender's utility functions, the prior distributions, or the Markov transition kernels. For such a problem, we design a provably efficient no-regret learning algorithm, the Optimism-Pessimism Principle for Persuasion Process (OP4), which features a novel combination of the optimism and pessimism principles. In particular, we obtain optimistic estimates of the value functions to encourage exploration in the unknown environment, and we additionally robustify the signaling policy against uncertainty in the prior estimates to prevent detrimental equilibrium behavior by the receivers. 
Our algorithm is sample-efficient, achieving a sublinear O(√T) regret upper bound. Furthermore, both our algorithm and our theory can be applied to MPPs with large outcome and state spaces via function approximation, and we showcase such a success in the linear setting.
"Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning." Jibang Wu, Zixuan Zhang, Zhe Feng, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan, Haifeng Xu. Proceedings of the 23rd ACM Conference on Economics and Computation. DOI: 10.1145/3490486.3538313. Published 2022-02-22.
Apportionment is the problem of distributing h indivisible seats across states in proportion to the states' populations. In the context of the US House of Representatives, this problem has a rich history and is a prime example of the interaction between mathematical analysis and political practice. Grimmett [2004] suggested apportioning seats in a randomized way such that each state receives exactly its proportional share qi of seats in expectation (ex ante proportionality) and receives either ⌊qi⌋ or ⌈qi⌉ seats ex post (quota). However, there is a vast space of randomized apportionment methods satisfying these two axioms, and so we additionally consider prominent axioms from the apportionment literature. Our main result is a randomized method satisfying quota, ex ante proportionality, and house monotonicity, a property that prevents paradoxes when the number of seats changes and which we require to hold ex post. This result is based on a generalization of dependent rounding on bipartite graphs, which we call cumulative rounding and which might be of independent interest, as we demonstrate via applications beyond apportionment. The full version of this paper is available at https://arxiv.org/pdf/2202.11061.pdf.
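Grimmett's randomized scheme mentioned above admits a compact sketch via systematic sampling: each state gets its lower quota ⌊qi⌋, and the leftover seats are assigned by dropping the points u, u+1, ... onto the fractional parts laid end to end, so each state's expected seat count is exactly qi and its realized count is its lower or upper quota. The populations below are hypothetical, and the implementation details are our own reading rather than code from the paper.

```python
import math
import random

def randomized_apportionment(populations, h, rng):
    total = sum(populations)
    quotas = [h * p / total for p in populations]
    base = [math.floor(q) for q in quotas]
    frac = [q - b for q, b in zip(quotas, base)]
    r = round(sum(frac))                 # leftover seats (an integer)
    u = rng.random()
    points = [u + k for k in range(r)]   # systematic sample on [0, r)
    seats, cum = base[:], 0.0
    for i, f in enumerate(frac):
        # State i wins an extra seat iff a sample point lands in its interval.
        seats[i] += sum(1 for pt in points if cum <= pt < cum + f)
        cum += f
    return seats

seats = randomized_apportionment([21878, 9713, 4167], h=10, rng=random.Random(1))
```

Every realization allots exactly h seats and stays within one seat of each state's quota.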
"In This Apportionment Lottery, the House Always Wins." Paul Golz, Dominik Peters, A. Procaccia. Proceedings of the 23rd ACM Conference on Economics and Computation. DOI: 10.1145/3490486.3538299. Published 2022-02-22.
We study hidden-action principal-agent problems in which a principal commits to an outcome-dependent payment scheme (called a contract) so as to incentivize the agent to take a costly, unobservable action leading to favorable outcomes. In particular, we focus on Bayesian settings where the agent has private information. This is collectively encoded by the agent's type, which is unknown to the principal, but randomly drawn according to a finitely-supported, commonly-known probability distribution. In our model, the agent's type determines both the probability distribution over outcomes and the cost associated with each of the agent's actions. In Bayesian principal-agent problems, the principal may be better off committing to a menu of contracts specifying a contract for each agent type, rather than committing to a single contract. This induces a two-stage process that resembles interactions studied in classical mechanism design: after the principal has committed to a menu, the agent first reports a type to the principal, and, then, the latter puts in place the contract in the menu that corresponds to the reported type. Thus, the principal's computational problem boils down to designing a menu of contracts that incentivizes the agent to report their true type and maximizes expected utility. Previous works showed that, in Bayesian principal-agent problems, computing an optimal menu of contracts or an optimal (single) contract is APX-hard, in sharp contrast to what happens in non-Bayesian settings, where an optimal contract can be computed efficiently. Crucially, previous works focus on menus of deterministic contracts. Surprisingly, in this paper we show that, if one instead considers menus of randomized contracts, defined as probability distributions over payment vectors, then an optimal menu can be computed in polynomial time. 
Besides this main result, we also close several gaps in the computational complexity analysis of the problem of computing menus of deterministic contracts. In particular, we prove that the problem cannot be approximated to within any multiplicative factor, and that it does not admit an additive FPTAS unless P = NP, even in basic instances with a constant number of actions and only four outcomes. This considerably extends previously known negative results. Then, we show that our hardness result is tight, by providing an additive PTAS that works in instances with a constant number of outcomes. We complete our analysis by showing that an optimal menu of deterministic contracts can be computed in polynomial time when either there are only two outcomes or there is a constant number of types.
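The hidden-action model can be made concrete with a toy instance. The following brute-force grid search over single contracts, on a two-outcome, two-action, two-type instance with made-up numbers (not the paper's polynomial-time algorithm), shows the agent best-responding to a payment vector and the principal evaluating expected utility.

```python
from itertools import product

outcome_reward = 10.0                       # principal's reward on success
types = {                                   # type -> (success prob. per action, cost per action)
    "A": ([0.2, 0.9], [0.0, 1.0]),
    "B": ([0.1, 0.5], [0.0, 2.0]),
}
type_prob = {"A": 0.5, "B": 0.5}

def agent_utility(succ, cost, pay):
    # pay = (payment on failure, payment on success)
    return succ * pay[1] + (1 - succ) * pay[0] - cost

def principal_value(pay):
    total = 0.0
    for t, (succ, cost) in types.items():
        # Agent best-responds; ties broken in favor of higher success.
        a = max(range(2),
                key=lambda i: (agent_utility(succ[i], cost[i], pay), succ[i]))
        exp_pay = succ[a] * pay[1] + (1 - succ[a]) * pay[0]
        total += type_prob[t] * (succ[a] * outcome_reward - exp_pay)
    return total

# Limited liability: pay nothing on failure; grid over success payments.
grid = [i * 0.5 for i in range(21)]
best_pay = max(product([0.0], grid), key=principal_value)
```

On this instance the best single contract on the grid pays 1.5 on success, inducing high effort from type A only; richer menus (and randomization, per the result above) can only do better.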
"Designing Menus of Contracts Efficiently: The Power of Randomization." Matteo Castiglioni, A. Marchesi, N. Gatti. Proceedings of the 23rd ACM Conference on Economics and Computation. DOI: 10.1145/3490486.3538270. Published 2022-02-22.
In delegation problems, a principal does not have the resources necessary to complete a particular task, so they delegate the task to an untrusted agent whose interests may differ from their own. Given any family of such problems and space of mechanisms for the principal to choose from, the delegation gap is the worst-case ratio of the principal's optimal utility when they delegate versus their optimal utility when solving the problem on their own. In this work, we consider the delegation gap of the generalized Pandora's box problem, a search problem in which searching for solutions incurs known costs and solutions are restricted by some downward-closed constraint. First, we show that there is a special case when all random variables have binary support for which there exist constant-factor delegation gaps for matroid constraints. However, there is no constant-factor delegation gap for even simple non-binary instances of the problem. Getting around this impossibility, we consider two variants: the free-agent model, in which the agent does not pay the cost of probing elements, and discounted-cost approximations, in which we discount all costs and aim for a bicriteria approximation of the discount factor and delegation gap. We show that there are constant-factor delegation gaps in the free-agent model with discounted-cost approximations for certain downward-closed constraints and constant discount factors. However, constant delegation gaps cannot be achieved under either variant alone. Finally, we consider another variant called the shared-cost model, in which the principal can choose how costs will be shared between them and the agent before delegating the search problem. We show that the shared-cost model exhibits a constant-factor delegation gap for certain downward-closed constraints.
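For intuition about the underlying search problem, here is a sketch of the classic single-choice Pandora's box with Weitzman's reservation-value rule, a special case of the generalized problem above (the distributions and costs are illustrative, and this is the undelegated problem, not the paper's delegation mechanism): open boxes in decreasing order of the value σ solving E[(V − σ)⁺] = cost, and stop once the best prize in hand exceeds the next index.

```python
import random

def reservation_value(values, probs, cost, lo=0.0, hi=100.0):
    # Binary search for sigma solving E[(V - sigma)^+] = cost.
    for _ in range(60):
        mid = (lo + hi) / 2
        if sum(p * max(v - mid, 0.0) for v, p in zip(values, probs)) > cost:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def pandora(boxes, rng):
    # boxes: list of (values, probs, cost); returns realized net payoff.
    order = sorted(boxes, key=lambda b: -reservation_value(*b))
    best, paid = 0.0, 0.0
    for values, probs, cost in order:
        if best >= reservation_value(values, probs, cost):
            break                      # stop: prize in hand beats the index
        paid += cost
        best = max(best, rng.choices(values, probs)[0])
    return best - paid

result = pandora(
    [([0.0, 10.0], [0.5, 0.5], 1.0),   # reservation value 8
     ([0.0, 6.0], [0.5, 0.5], 0.5)],   # reservation value 5
    random.Random(3),
)
```

The delegation question is then what fraction of this policy's value the principal can guarantee when an untrusted agent probes on their behalf.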
"Delegated Pandora's Box." Curtis Bechtel, S. Dughmi, Neel Patel. Proceedings of the 23rd ACM Conference on Economics and Computation. DOI: 10.1145/3490486.3538267. Published 2022-02-21.
D. Bergemann, Yang Cai, Grigoris Velegkas, Mingfei Zhao
We study the problem of selling information to a data buyer who faces a decision problem under uncertainty. We consider the classic Bayesian decision-theoretic model pioneered by Blackwell. Initially, the data buyer has only partial information about the payoff-relevant state of the world. A data seller offers additional information about the state of the world. The information is revealed through signaling schemes, also referred to as experiments. In the single-agent setting, any mechanism can be represented as a menu of experiments. A recent paper by Bergemann et al. [8] presents a complete characterization of the revenue-optimal mechanism in a binary-state, binary-action environment. By contrast, no characterization is known for the case with more actions. In this paper, we consider more general environments and study arguably the simplest mechanism, which only sells the fully informative experiment. In the environment with a binary state and m ≥ 3 actions, we provide an O(m)-approximation to the optimal revenue by selling only the fully informative experiment, and we show that the approximation ratio is tight up to an absolute constant factor. An important corollary of our lower bound is that the size of the optimal menu must grow at least linearly in the number of available actions, so no universal upper bound exists for the size of the optimal menu in the general single-dimensional setting. We also provide a sufficient condition under which selling only the fully informative experiment achieves the optimal revenue. For multi-dimensional environments, we prove that even in arguably the simplest matching utility environment with 3 states and 3 actions, the ratio between the optimal revenue and the revenue from selling only the fully informative experiment can grow polynomially in the number of agent types. Nonetheless, if the distribution is uniform, we show that selling only the fully informative experiment is indeed the optimal mechanism.
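The buyer's value for the fully informative experiment has a simple closed form: the expected payoff from best-responding to the realized state, minus the payoff of the single best action against the prior. A small sketch with made-up numbers:

```python
def value_of_full_information(prior, payoff):
    # payoff[state][action]; prior is a distribution over states.
    n_actions = len(payoff[0])
    # With full information, the buyer best-responds in every state.
    informed = sum(p * max(row) for p, row in zip(prior, payoff))
    # Without it, the buyer commits to one action against the prior.
    uninformed = max(
        sum(p * row[a] for p, row in zip(prior, payoff))
        for a in range(n_actions)
    )
    return informed - uninformed

prior = [0.6, 0.4]
payoff = [[1.0, 0.0, 0.3],   # state 0
          [0.0, 1.0, 0.3]]   # state 1
price = value_of_full_information(prior, payoff)
```

This gap is the most a single buyer type would pay for complete information; the results above concern how much of the optimal revenue such a one-item "menu" can capture across types.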
{"title":"Is Selling Complete Information (Approximately) Optimal?","authors":"D. Bergemann, Yang Cai, Grigoris Velegkas, Mingfei Zhao","doi":"10.1145/3490486.3538304","DOIUrl":"https://doi.org/10.1145/3490486.3538304","url":null,"abstract":"We study the problem of selling information to a data-buyer who faces a decision problem under uncertainty. We consider the classic Bayesian decision-theoretic model pioneered by Blackwell. Initially, the data buyer has only partial information about the payoff-relevant state of the world. A data seller offers additional information about the state of the world. The information is revealed through signaling schemes, also referred to as experiments. In the single-agent setting, any mechanism can be represented as a menu of experiments. A recent paper by Bergemann et al.[8] present a complete characterization of the revenue-optimal mechanism in a binary state and binary action environment. By contrast, no characterization is known for the case with more actions. In this paper, we consider more general environments and study arguably the simplest mechanism, which only sells the fully informative experiment. In the environment with binary state and m≥3 actions, we provide an $O(m)$-approximation to the optimal revenue by selling only the fully informative experiment and show that the approximation ratio is tight up to an absolute constant factor. An important corollary of our lower bound is that the size of the optimal menu must grow at least linearly in the number of available actions, so no universal upper bound exists for the size of the optimal menu in the general single-dimensional setting. We also provide a sufficient condition under which selling only the fully informative experiment achieves the optimal revenue. 
For multi-dimensional environments, we prove that even in arguably the simplest matching utility environment with 3 states and 3 actions, the ratio between the optimal revenue and the revenue by selling only the fully informative experiment can grow immediately to a polynomial of the number of agent types. Nonetheless, if the distribution is uniform, we show that selling only the fully informative experiment is indeed the optimal mechanism.","PeriodicalId":209859,"journal":{"name":"Proceedings of the 23rd ACM Conference on Economics and Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133572560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
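To make the "selling only the fully informative experiment" benchmark concrete, here is a minimal sketch in the binary-state setting: a buyer type is a prior over the state, the value of the fully informative experiment is the gain from acting with full knowledge of the state versus acting on the prior alone, and the seller posts a single price for it. All payoff numbers and type distributions below are illustrative assumptions, not taken from the paper.

```python
def value_of_full_info(p, payoffs):
    """p: prior probability of state 1; payoffs[a] = (u(a, state 0), u(a, state 1)).
    Value of learning the state = informed payoff - best uninformed payoff."""
    informed = (1 - p) * max(u0 for u0, _ in payoffs) + p * max(u1 for _, u1 in payoffs)
    uninformed = max((1 - p) * u0 + p * u1 for u0, u1 in payoffs)
    return informed - uninformed

def best_single_price(types, payoffs):
    """types: list of (prior, probability mass). Returns the revenue-maximizing
    posted price for the fully informative experiment and its expected revenue."""
    vals = [(value_of_full_info(p, payoffs), q) for p, q in types]
    best_price, best_rev = 0.0, 0.0
    for t, _ in vals:  # an optimal posted price equals some type's value
        rev = t * sum(q for v, q in vals if v >= t - 1e-12)
        if rev > best_rev:
            best_price, best_rev = t, rev
    return best_price, best_rev

# illustrative example: binary state, three actions, three equally likely types
payoffs = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]  # act-for-0, act-for-1, safe action
types = [(0.2, 1 / 3), (0.5, 1 / 3), (0.8, 1 / 3)]
price, revenue = best_single_price(types, payoffs)
```

Here the confident types (priors 0.2 and 0.8) value full information less than the uncertain type (prior 0.5), so the seller trades off a high price for one type against a low price for all three.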
In a single-parameter mechanism design problem, a provider is looking to sell some service to a group of potential buyers. Each buyer i has a private value vi for receiving this service, and some feasibility constraint restricts which subsets of buyers can be served simultaneously. Recent work in economics introduced (deferred-acceptance) clock auctions as a superior class of auctions for this problem, due to their transparency, simplicity, and very strong incentive guarantees. Subsequent work in computer science focused on evaluating these auctions with respect to their social welfare approximation guarantees, leading to strong impossibility results: in the absence of prior information regarding the buyers' values, no deterministic clock auction can achieve a bounded approximation, even for simple feasibility constraints with only two maximal feasible sets. We show that these negative results can be circumvented either by using access to prior information or by leveraging randomization. In particular, we provide clock auctions that give an O(log log k) approximation for general downward-closed feasibility constraints with k maximal feasible sets, for three different information models, ranging from full access to the value distributions to complete absence of information. The more information the seller has, the simpler and more practical these auctions are. Under full access, we use a particularly simple deterministic clock auction, called a single-price clock auction, which is only slightly more complex than posted-price mechanisms. In this auction, each buyer is offered a single price, and then a feasible set is selected among those who accept their offers.
In addition to our main results, we propose a parameterization that interpolates between single-price clock auctions and general clock auctions, paving the way for an exciting line of future research.
{"title":"Bayesian and Randomized Clock Auctions","authors":"M. Feldman, Vasilis Gkatzelis, N. Gravin, Daniel Schoepflin","doi":"10.1145/3490486.3538247","DOIUrl":"https://doi.org/10.1145/3490486.3538247","url":null,"abstract":"In a single-parameter mechanism design problem, a provider is looking to sell some service to a group of potential buyers. Each buyer i has a private value vi for receiving this service, and some feasibility constraint restricts which subsets of buyers can be served simultaneously. Recent work in economics introduced (deferred-acceptance) clock auctions as a superior class of auctions for this problem, due to their transparency, simplicity, and very strong incentive guarantees. Subsequent work in computer science focused on evaluating these auctions with respect to their social welfare approximation guarantees, leading to strong impossibility results: in the absence of prior information regarding the buyers' values, no deterministic clock auction can achieve a bounded approximation, even for simple feasibility constraints with only two maximal feasible sets. We show that these negative results can be circumvented either by using access to prior information or by leveraging randomization. In particular, we provide clock auctions that give a O(log log k) approximation for general downward-closed feasibility constraints with k maximal feasible sets, for three different information models, ranging from full access to the value distributions to complete absence of information. The more information the seller has, the simpler and more practical these auctions are. Under full access, we use a particularly simple deterministic clock auction, called a single-price clock auction, which is only slightly more complex than posted price mechanisms. In this auction, each buyer is offered a single price, then a feasible set is selected among those who accept their offers. 
In the other extreme, where no prior information is available, this approximation guarantee is obtained using a complex randomized clock auction. In addition to our main results, we propose a parameterization that interpolates between single-price clock auctions and general clock auctions, paving the way for an exciting line of future research.","PeriodicalId":209859,"journal":{"name":"Proceedings of the 23rd ACM Conference on Economics and Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130608121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
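The single-price format described above is simple enough to sketch directly: each buyer faces one take-it-or-leave-it price, and a feasible set is then chosen among the accepters. The tie-breaking rule (maximizing posted-price revenue over the maximal feasible sets) and all numbers below are illustrative assumptions, not the paper's construction.

```python
def single_price_clock_auction(values, prices, feasible_sets):
    """values[i]: buyer i's private value; prices[i]: the single posted offer
    buyer i faces; feasible_sets: the maximal sets of buyers that can be
    served together. Truthfulness is immediate: each buyer only decides
    whether to accept a fixed price."""
    accepted = {i for i, v in enumerate(values) if v >= prices[i]}
    best_set, best_rev = set(), 0.0
    for s in feasible_sets:
        served = s & accepted  # only accepters may be served
        rev = sum(prices[i] for i in served)
        if rev > best_rev:
            best_set, best_rev = served, rev
    return best_set, best_rev

# illustrative instance: two maximal feasible sets; buyer 2 declines its offer
served, revenue = single_price_clock_auction(
    values=[3.0, 2.0, 2.5],
    prices=[2.0, 2.0, 3.0],
    feasible_sets=[{0, 1}, {2}],
)
```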
A century ago, Émile Borel published his seminal paper on the theory of play and integral equations with skew-symmetric kernels [1]. Borel describes what is now called the Blotto game: a resource-allocation game in which two players compete over n different battlefields by simultaneously allocating resources to each battlefield. The following two additional characteristics are perhaps the most salient features of the Blotto game. Winner-takes-all: for each battlefield, the player allocating the most resources to that battlefield wins it. Fixed budget: each player is subject to a fixed---and deterministic---budget that mixed strategies must satisfy almost surely. Despite its century-long existence, Nash equilibria for the Blotto game are known only under various restrictions on the main parameters of the problem: the budget of each player and the value assigned to each battlefield. Moreover, previous solutions for two-player games have consisted of constructing explicit solutions. Because of the budget constraints, these strategies can be decomposed into two parts: marginal distributions that indicate which (random) strategy to play on each battlefield, and a coupling that correlates the marginal strategies in such a way as to ensure that the budget constraint is satisfied almost surely. The first part may be studied independently of the second by considering what is known as the (General) Lotto game. In this game, the budget constraint need only be enforced in expectation with respect to the randomization of the mixed strategies. While this setup lacks a defining characteristic of the Blotto game (the fixed budget), it has the advantage of lending itself to more tractable computations.
In light of this progress, a natural question is whether the marginal solutions discovered in [2] can be coupled in such a way that the budget constraint is satisfied almost surely. We provide a positive answer to this question by appealing to an existing result from the theory of joint mixability [5]. Mixability asks the following question: can n random variables X1, ..., Xn with prescribed marginal distributions Xi ~ Pi be coupled in such a way that var(X1 + ··· + Xn) = 0? Joint mixability is precisely the step required to go from a Lotto solution to a Blotto one: it couples the marginals of the Lotto solution in such a way that the budget constraint is satisfied. In this paper, we exploit a simple connection between joint mixability and the theory of multi-marginal couplings. We propose an algorithmic solution to the Blotto problem by efficiently constructing a coupling that satisfies the budget constraint almost surely and can easily be sampled from. Our construction relies on three key steps: first, we reduce the prob
{"title":"An Algorithmic Solution to the Blotto Game using Multi-marginal Couplings","authors":"Vianney Perchet, P. Rigollet, Thibaut Le Gouic","doi":"10.1145/3490486.3538240","DOIUrl":"https://doi.org/10.1145/3490486.3538240","url":null,"abstract":"A century ago, Emile Borel published his seminal paper on the theory of play and integral equations with skew-symmetric kernels[1]. Borel describes what is now called the Blotto game: a resource-allocation game in which two players compete for over n different battlefields by simultaneously allocating resources to each battlefield. The following two additional characteristics are perhaps the most salient features of the Blotto game: Winner-takes-all: For each battlefield, the player allocating the most resources to a given battlefield wins the battlefield. Fixed budget: each player is subject to a fixed---and deterministic---budget that mixed strategies should satisfy almost surely. Despite its century-long existence, Nash equilibria for the Blotto game are only known under various restrictions on the main parameters of the problem: the budget of each player and the value given to each battlefield. Moreover, previous solutions for two-player games have consisted in constructing explicit solutions. Because of the budget constraints, these strategies can be decomposed into two parts: marginal distributions that indicate which (random) strategy to play on each battlefield and a coupling that correlates the marginal strategies in such a way to ensure that the budget constraint is satisfied almost surely. The first part may be studied independently of the second by considering what is known as the (General) Lotto game. In this game, the budget constraint needs only be enforced in expectation with respect to the randomization of the mixed strategies. While this setup lacks a defining characteristic of the Blotto game (fixed budget), it has the advantage of lending itself to more amenable computations. 
Indeed, unlike the Blotto game, a complete solution to the Lotto game was recently proposed in [2] where the authors describe an explicit Nash equilibrium in the most general case: asymmetric budget, asymmetric and heterogeneous values. In light of this progress, a natural question is whether the marginal solutions discovered in [2] could be coupled in such a way that the budget constraint is satisfied almost surely. We provide a positive answer to this question by appealing to an existing result from the theory of joint mixability [5]. Mixability asks the following question: Can n random variables X1, ..., Xn with prescribed marginal distributions Xi ~ Pi, be coupled in such a way that var(X1+ ··· + Xn)=0. Joint mixability is precisely the step required to go from a Lotto solution to a Blotto one by coupling the marginals of the Lotto solution in such a way that the budget constraint is satisfied. In this paper, we exploit a simple connection between joint mixability and the theory of multi-marginal couplings. We propose an algorithmic solution to the Blotto problem by efficiently constructing a coupling that satisfies the budget constraint almost surely and can be easily sampled from. Our construction relies on three key steps: first, we reduce the prob","PeriodicalId":209859,"journal":{"name":"Proceedings of the 23rd ACM Conference on Economics and Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124138270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
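The joint-mixability question above has a classical answer in the simplest special case, which illustrates the Lotto-to-Blotto step: with two battlefields and uniform Lotto marginals, the antithetic coupling X2 = B - X1 preserves both marginals while making the total spend exactly the budget B, so var(X1 + X2) = 0. This toy construction is only a two-marginal illustration of the idea; the paper's contribution is the general multi-marginal coupling.

```python
import random

def antithetic_lotto_to_blotto(budget, n_samples=10_000, seed=0):
    """Two battlefields with Uniform[0, budget] Lotto marginals: the
    antithetic coupling x2 = budget - x1 keeps each marginal uniform
    while making the total allocation deterministically equal the budget."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_samples):
        x1 = rng.uniform(0, budget)
        x2 = budget - x1  # coupling: the sum is constant, so its variance is 0
        draws.append((x1, x2))
    return draws

draws = antithetic_lotto_to_blotto(1.0)
```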
Motivated by online advertising auctions, we study auction design in repeated auctions played by simple Artificial Intelligence algorithms (Q-learning). We find that first-price auctions with no additional feedback lead to tacitly collusive outcomes (bids lower than values), while second-price auctions do not. We show that the difference is driven by the incentive in first-price auctions to outbid opponents by just one bid increment, which facilitates re-coordination on low bids after a phase of experimentation. We also show that providing information about the lowest bid to win, as introduced by Google at the time of its switch to first-price auctions, increases the competitiveness of auctions.
{"title":"Artificial Intelligence and Auction Design","authors":"M. Banchio, Andrzej Skrzypacz","doi":"10.1145/3490486.3538244","DOIUrl":"https://doi.org/10.1145/3490486.3538244","url":null,"abstract":"Motivated by online advertising auctions, we study auction design in repeated auctions played by simple Artificial Intelligence algorithms (Q-learning). We find that first-price auctions with no additional feedback lead to tacit-collusive outcomes (bids lower than values), while second-price auctions do not. We show that the difference is driven by the incentive in first-price auctions to outbid opponents by just one bid increment. This facilitates re-coordination on low bids after a phase of experimentation. We also show that providing information about the lowest bid to win, as introduced by Google at the time of the switch to first-price auctions, increases competitiveness of auctions.","PeriodicalId":209859,"journal":{"name":"Proceedings of the 23rd ACM Conference on Economics and Computation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123527281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
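The simulation environment in the abstract above can be sketched with stateless Q-learning: two epsilon-greedy bidders repeatedly play a sealed-bid first-price auction over a discrete bid grid, and the winner's reward is its value minus its bid. All hyperparameters (grid, learning rate, exploration rate) are illustrative assumptions; the sketch shows the learning dynamics, not the paper's calibration or its collusion result.

```python
import random

def simulate_fpa_qlearning(rounds=20_000, value=1.0, alpha=0.05, eps=0.1, seed=0):
    """Two stateless epsilon-greedy Q-learners repeatedly play a sealed-bid
    first-price auction; each round the winner earns value - bid, the loser 0.
    Returns each bidder's final greedy bid."""
    rng = random.Random(seed)
    bid_grid = [k / 10 for k in range(11)]  # discrete bids 0.0, 0.1, ..., 1.0
    Q = [[0.0] * len(bid_grid) for _ in range(2)]
    for _ in range(rounds):
        # epsilon-greedy action selection for each bidder
        acts = [
            rng.randrange(len(bid_grid)) if rng.random() < eps
            else max(range(len(bid_grid)), key=Q[i].__getitem__)
            for i in range(2)
        ]
        bids = [bid_grid[a] for a in acts]
        winner = rng.randrange(2) if bids[0] == bids[1] else bids.index(max(bids))
        for i in range(2):
            reward = value - bids[i] if i == winner else 0.0
            # stateless Q-update: move the action's estimate toward its reward
            Q[i][acts[i]] += alpha * (reward - Q[i][acts[i]])
    return [bid_grid[max(range(len(bid_grid)), key=Q[i].__getitem__)] for i in range(2)]

greedy_bids = simulate_fpa_qlearning()
```

Varying the feedback the learners receive (e.g., revealing the lowest winning bid) is the kind of design change the paper evaluates in this style of environment.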