On Learning How Players Learn: Estimation of Learning Dynamics in the Routing Game

Kiet Lam, W. Krichene, A. Bayen
Published in: 2016 ACM/IEEE 7th International Conference on Cyber-Physical Systems (ICCPS), pp. 1-10
Publication date: 2016-04-11
DOI: 10.1145/3078620 (https://doi.org/10.1145/3078620)
Citations: 28

Abstract

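The routing game models congestion in transportation networks, communication networks, and other cyber-physical systems in which agents compete for shared resources. We consider an online learning model of player dynamics: at each iteration, every player chooses a route (or a probability distribution over routes, which corresponds to a flow allocation over the physical network); the joint decision of all players then determines the cost of each path, and these costs are revealed to the players. We pose the following estimation problem: given a sequence of player decisions and the corresponding costs, estimate the parameters of the learning model. We consider in particular entropic mirror descent dynamics and reduce the problem to estimating the learning rate of each player. We demonstrate this method using data collected from a routing game experiment played by human participants. We develop a web application to implement the routing game. When players log in, they are assigned an origin and a destination on the graph. At each iteration, they can choose a distribution over their available routes, and each player seeks to minimize her own cost. We collect a data set using this interface, then apply the proposed method to estimate the learning model parameters. We observe in particular that, after an exploration phase, the joint decision of the players remains within a small distance of the Nash equilibrium. We also use the estimated model parameters to predict the flow distribution over routes and compare these predictions to the actual distribution. Finally, we discuss some qualitative implications of the experiments and give directions for future research.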
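To make the estimation problem concrete, here is a minimal sketch of one entropic mirror descent update and of recovering a single player's learning rate from one observed step. It assumes the standard multiplicative-weights form of the update and fits the learning rate by minimizing the KL divergence between the observed and the predicted next distribution; this is one plausible formulation, not necessarily the exact one used in the paper. The function names (`emd_step`, `kl`, `estimate_eta`), the search bounds, and the sample numbers are illustrative assumptions.

```python
import math

def emd_step(x, costs, eta):
    """One entropic mirror descent update on the simplex:
    x_next[i] is proportional to x[i] * exp(-eta * costs[i])."""
    weights = [xi * math.exp(-eta * ci) for xi, ci in zip(x, costs)]
    total = sum(weights)
    return [w / total for w in weights]

def kl(p, q):
    """KL divergence D(p || q); entries with p[i] == 0 contribute nothing.
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def estimate_eta(x_prev, costs, x_next, lo=0.0, hi=50.0, iters=100):
    """Estimate the learning rate from one observed step.

    The objective KL(x_next || emd_step(x_prev, costs, eta)) is convex in eta
    (a linear term plus a log-sum-exp of affine functions), so a ternary
    search on the assumed interval [lo, hi] is enough for this sketch."""
    def objective(eta):
        return kl(x_next, emd_step(x_prev, costs, eta))
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

# Hypothetical data: a player with three routes and a known learning rate.
x_prev = [0.5, 0.3, 0.2]        # current distribution over routes
costs = [1.0, 2.0, 0.5]         # path costs revealed after the joint decision
x_next = emd_step(x_prev, costs, eta=0.7)
print(estimate_eta(x_prev, costs, x_next))
```

With real data the observed next distribution will not lie exactly on the model's update path, so the minimizer is a best fit rather than an exact recovery; summing the objective over several iterations before minimizing would be a natural extension.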