Watch and learn: optimizing from revealed preferences feedback

Aaron Roth, Jonathan Ullman, Zhiwei Steven Wu
{"title":"Watch and learn: optimizing from revealed preferences feedback","authors":"Aaron Roth, Jonathan Ullman, Zhiwei Steven Wu","doi":"10.1145/2897518.2897579","DOIUrl":null,"url":null,"abstract":"A Stackelberg game is played between a leader and a follower. The leader first chooses an action, then the follower plays his best response. The goal of the leader is to pick the action that will maximize his payoff given the follower’s best response. In this paper we present an approach to solving for the leader’s optimal strategy in certain Stackelberg games where the follower’s utility function (and thus the subsequent best response of the follower) is unknown. Stackelberg games capture, for example, the following interaction between a producer and a consumer. The producer chooses the prices of the goods he produces, and then a consumer chooses to buy a utility maximizing bundle of goods. The goal of the seller here is to set prices to maximize his profit—his revenue, minus the production cost of the purchased bundle. It is quite natural that the seller in this example should not know the buyer’s utility function. However, he does have access to revealed preference feedback---he can set prices, and then observe the purchased bundle and his own profit. We give algorithms for efficiently solving, in terms of both computational and query complexity, a broad class of Stackelberg games in which the follower’s utility function is unknown, using only “revealed preference” access to it. This class includes in particular the profit maximization problem, as well as the optimal tolling problem in nonatomic congestion games, when the latency functions are unknown. Surprisingly, we are able to solve these problems even though the optimization problems are non-convex in the leader’s actions.","PeriodicalId":442965,"journal":{"name":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"58","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2897518.2897579","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 58

Abstract

A Stackelberg game is played between a leader and a follower. The leader first chooses an action, then the follower plays his best response. The goal of the leader is to pick the action that will maximize his payoff given the follower’s best response. In this paper we present an approach to solving for the leader’s optimal strategy in certain Stackelberg games where the follower’s utility function (and thus the subsequent best response of the follower) is unknown. Stackelberg games capture, for example, the following interaction between a producer and a consumer. The producer chooses the prices of the goods he produces, and then a consumer chooses to buy a utility maximizing bundle of goods. The goal of the seller here is to set prices to maximize his profit—his revenue, minus the production cost of the purchased bundle. It is quite natural that the seller in this example should not know the buyer’s utility function. However, he does have access to revealed preference feedback---he can set prices, and then observe the purchased bundle and his own profit. We give algorithms for efficiently solving, in terms of both computational and query complexity, a broad class of Stackelberg games in which the follower’s utility function is unknown, using only “revealed preference” access to it. This class includes in particular the profit maximization problem, as well as the optimal tolling problem in nonatomic congestion games, when the latency functions are unknown. Surprisingly, we are able to solve these problems even though the optimization problems are non-convex in the leader’s actions.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
观察和学习:从显示的偏好反馈中优化
Stackelberg游戏是领导者和追随者之间的游戏。领导者首先选择一个行动,然后追随者采取他的最佳对策。领导者的目标是在追随者做出最佳反应的情况下,选择能使自己收益最大化的行动。在本文中,我们提出了在某些Stackelberg博弈中求解领导者最优策略的方法,其中追随者的效用函数(因此追随者的后续最佳对策)是未知的。例如,Stackelberg游戏捕捉了生产者和消费者之间的以下互动。生产者选择他生产的商品的价格,然后消费者选择购买效用最大化的一束商品。卖家的目标是设定价格,以使利润最大化——他的收入减去所购买产品的生产成本。很自然,在这个例子中,卖方不应该知道买方的效用函数。然而,他确实可以获得显示的偏好反馈——他可以设定价格,然后观察购买的捆绑包和自己的利润。在计算和查询复杂度方面,我们给出了有效解决一类广泛的Stackelberg博弈的算法,其中追随者的效用函数是未知的,只使用“显示偏好”访问它。这个类特别包括利润最大化问题,以及在延迟函数未知的非原子拥塞博弈中的最优收费问题。令人惊讶的是,我们能够解决这些问题,即使优化问题在领导者的行动中是非凸的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Exponential separation of communication and external information Proceedings of the forty-eighth annual ACM symposium on Theory of Computing Explicit two-source extractors and resilient functions Constant-rate coding for multiparty interactive communication is impossible Approximating connectivity domination in weighted bounded-genus graphs
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1