The Role of Reinforcement Learning in the Emergence of Conventions: Simulation Experiments with the Repeated Volunteer's Dilemma

H. Nunner, W. Przepiorka, Chris Janssen
Journal: J. Artif. Soc. Soc. Simul.
Publication date: 2022-01-01
DOI: https://doi.org/10.18564/jasss.4771
Citations: 1

Abstract

We use reinforcement learning models to investigate the role of cognitive mechanisms in the emergence of conventions in the repeated volunteer’s dilemma (VOD). The VOD is a multi-person, binary choice collective goods game in which the contribution of only one individual is necessary and sufficient to produce a benefit for the entire group. Behavioral experiments show that in the symmetric VOD, where all group members have the same costs of volunteering, a turn-taking convention emerges, whereas in the asymmetric VOD, where one “strong” group member has lower costs of volunteering, a solitary-volunteering convention emerges with the strong member volunteering most of the time. We compare three different classes of reinforcement learning models in their ability to replicate these empirical findings. Our results confirm that reinforcement learning models can provide a parsimonious account of how humans tacitly agree on one course of action when encountering each other repeatedly in the same interaction situation. We find that considering contextual clues (i.e., reward structures) for strategy design (i.e., sequences of actions) and strategy selection (i.e., favoring equal distribution of costs) facilitate coordination when optima are less salient. Furthermore, our models produce better fits with the empirical data when agents act myopically (favoring current over expected future rewards) and the rewards for adhering to conventions are not delayed.
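
To make the game structure concrete, the following is a minimal Python sketch of one VOD round and of the two conventions named in the abstract (turn-taking and solitary volunteering). It is an illustration under stated assumptions, not the authors' simulation code: the function names, the benefit of 80, the costs of 50 and 10, the group size of 3, and the 6-round horizon are all hypothetical values chosen for the example. The `myopic_update` helper is a generic value update with a discount factor of 0 (only the immediate reward matters); the abstract does not specify the three model classes the paper compares, so this should not be read as any one of them.

```python
from typing import List, Sequence


def vod_payoffs(actions: Sequence[bool], costs: Sequence[float],
                benefit: float) -> List[float]:
    """Payoffs for one round of a volunteer's dilemma.

    actions[i] is True if player i volunteers. One volunteer is enough to
    produce `benefit` for everyone; each volunteer pays their own cost.
    If nobody volunteers, everyone receives 0 (an assumed baseline).
    """
    if len(actions) != len(costs):
        raise ValueError("actions and costs must have the same length")
    produced = any(actions)
    return [
        (benefit if produced else 0.0) - (cost if act else 0.0)
        for act, cost in zip(actions, costs)
    ]


def cumulative_payoffs(schedule: Sequence[Sequence[bool]],
                       costs: Sequence[float],
                       benefit: float) -> List[float]:
    """Sum each player's payoffs over a fixed sequence of rounds."""
    totals = [0.0] * len(costs)
    for actions in schedule:
        for i, payoff in enumerate(vod_payoffs(actions, costs, benefit)):
            totals[i] += payoff
    return totals


def myopic_update(q: float, reward: float, alpha: float) -> float:
    """Move a value estimate toward the immediate reward (discount factor 0)."""
    return q + alpha * (reward - q)


if __name__ == "__main__":
    BENEFIT = 80.0   # hypothetical value
    ROUNDS = 6       # hypothetical horizon
    N = 3            # hypothetical group size

    # Symmetric VOD with turn-taking: player (round mod N) volunteers.
    turn_taking = [[r % N == i for i in range(N)] for r in range(ROUNDS)]
    print(cumulative_payoffs(turn_taking, [50.0] * N, BENEFIT))
    # prints [380.0, 380.0, 380.0]

    # Asymmetric VOD with solitary volunteering: the "strong" player 0
    # (lower cost) volunteers in every round.
    solitary = [[i == 0 for i in range(N)] for _ in range(ROUNDS)]
    print(cumulative_payoffs(solitary, [10.0, 50.0, 50.0], BENEFIT))
    # prints [420.0, 480.0, 480.0]
```

Under these assumed numbers, turn-taking spreads the volunteering cost equally in the symmetric game, while solitary volunteering by the low-cost member minimizes the total cost paid by the group in the asymmetric game.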