Learning Equilibria in Matching Markets with Bandit Feedback

IF 2.3 2区 计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Journal of the ACM Pub Date : 2023-05-24 DOI:https://dl.acm.org/doi/10.1145/3583681
Meena Jagadeesan, Alexander Wei, Yixin Wang, Michael I. Jordan, Jacob Steinhardt
{"title":"Learning Equilibria in Matching Markets with Bandit Feedback","authors":"Meena Jagadeesan, Alexander Wei, Yixin Wang, Michael I. Jordan, Jacob Steinhardt","doi":"https://dl.acm.org/doi/10.1145/3583681","DOIUrl":null,"url":null,"abstract":"<p>Large-scale, two-sided matching platforms must find market outcomes that align with user preferences while simultaneously learning these preferences from data. Classical notions of stability (Gale and Shapley, 1962; Shapley and Shubik, 1971) are, unfortunately, of limited value in the learning setting, given that preferences are inherently uncertain and destabilizing while they are being learned. To bridge this gap, we develop a framework and algorithms for learning stable market outcomes under uncertainty. Our primary setting is matching with transferable utilities, where the platform both matches agents and sets monetary transfers between them. We design an incentive-aware learning objective that captures the distance of a market outcome from equilibrium. Using this objective, we analyze the complexity of learning as a function of preference structure, casting learning as a stochastic multi-armed bandit problem. Algorithmically, we show that “optimism in the face of uncertainty,” the principle underlying many bandit algorithms, applies to a primal-dual formulation of matching with transfers and leads to near-optimal regret bounds. Our work takes a first step toward elucidating when and how stable matchings arise in large, data-driven marketplaces.</p>","PeriodicalId":50022,"journal":{"name":"Journal of the ACM","volume":"24 1","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2023-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the ACM","FirstCategoryId":"94","ListUrlMain":"https://doi.org/https://dl.acm.org/doi/10.1145/3583681","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

Abstract

Large-scale, two-sided matching platforms must find market outcomes that align with user preferences while simultaneously learning these preferences from data. Classical notions of stability (Gale and Shapley, 1962; Shapley and Shubik, 1971) are, unfortunately, of limited value in the learning setting, given that preferences are inherently uncertain and destabilizing while they are being learned. To bridge this gap, we develop a framework and algorithms for learning stable market outcomes under uncertainty. Our primary setting is matching with transferable utilities, where the platform both matches agents and sets monetary transfers between them. We design an incentive-aware learning objective that captures the distance of a market outcome from equilibrium. Using this objective, we analyze the complexity of learning as a function of preference structure, casting learning as a stochastic multi-armed bandit problem. Algorithmically, we show that “optimism in the face of uncertainty,” the principle underlying many bandit algorithms, applies to a primal-dual formulation of matching with transfers and leads to near-optimal regret bounds. Our work takes a first step toward elucidating when and how stable matchings arise in large, data-driven marketplaces.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
强盗反馈下匹配市场的学习均衡
大规模的双边匹配平台必须找到符合用户偏好的市场结果,同时从数据中学习这些偏好。稳定性的经典概念(Gale and Shapley, 1962;Shapley和Shubik, 1971)不幸的是,在学习环境中价值有限,因为偏好在学习过程中具有固有的不确定性和不稳定性。为了弥补这一差距,我们开发了一个框架和算法来学习不确定性下的稳定市场结果。我们的主要设置是与可转让的公用事业相匹配,平台既匹配代理,又设置他们之间的货币转移。我们设计了一个激励意识学习目标,捕捉市场结果与均衡的距离。基于这一目标,我们分析了学习的复杂性作为偏好结构的函数,将学习作为一个随机的多臂强盗问题。在算法上,我们展示了“面对不确定性的乐观主义”,这是许多强盗算法的基本原则,适用于与转移匹配的原始对偶公式,并导致接近最优的后悔界限。我们的工作向阐明稳定匹配何时以及如何在大型数据驱动的市场中出现迈出了第一步。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Journal of the ACM
Journal of the ACM 工程技术-计算机:理论方法
CiteScore
7.50
自引率
0.00%
发文量
51
审稿时长
3 months
期刊介绍: The best indicator of the scope of the journal is provided by the areas covered by its Editorial Board. These areas change from time to time, as the field evolves. The following areas are currently covered by a member of the Editorial Board: Algorithms and Combinatorial Optimization; Algorithms and Data Structures; Algorithms, Combinatorial Optimization, and Games; Artificial Intelligence; Complexity Theory; Computational Biology; Computational Geometry; Computer Graphics and Computer Vision; Computer-Aided Verification; Cryptography and Security; Cyber-Physical, Embedded, and Real-Time Systems; Database Systems and Theory; Distributed Computing; Economics and Computation; Information Theory; Logic and Computation; Logic, Algorithms, and Complexity; Machine Learning and Computational Learning Theory; Networking; Parallel Computing and Architecture; Programming Languages; Quantum Computing; Randomized Algorithms and Probabilistic Analysis of Algorithms; Scientific Computing and High Performance Computing; Software Engineering; Web Algorithms and Data Mining
期刊最新文献
Query lower bounds for log-concave sampling Transaction Fee Mechanism Design Sparse Higher Order Čech Filtrations Killing a Vortex Separations in Proof Complexity and TFNP
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1