Learning in Multi-Objective Public Goods Games with Non-Linear Utilities

Nicole Orzan, Erman Acar, Davide Grossi, Patrick Mannion, Roxana Rădulescu
{"title":"具有非线性效用的多目标公共物品博弈中的学习","authors":"Nicole Orzan, Erman Acar, Davide Grossi, Patrick Mannion, Roxana Rădulescu","doi":"arxiv-2408.00682","DOIUrl":null,"url":null,"abstract":"Addressing the question of how to achieve optimal decision-making under risk\nand uncertainty is crucial for enhancing the capabilities of artificial agents\nthat collaborate with or support humans. In this work, we address this question\nin the context of Public Goods Games. We study learning in a novel\nmulti-objective version of the Public Goods Game where agents have different\nrisk preferences, by means of multi-objective reinforcement learning. We\nintroduce a parametric non-linear utility function to model risk preferences at\nthe level of individual agents, over the collective and individual reward\ncomponents of the game. We study the interplay between such preference\nmodelling and environmental uncertainty on the incentive alignment level in the\ngame. We demonstrate how different combinations of individual preferences and\nenvironmental uncertainties sustain the emergence of cooperative patterns in\nnon-cooperative environments (i.e., where competitive strategies are dominant),\nwhile others sustain competitive patterns in cooperative environments (i.e.,\nwhere cooperative strategies are dominant).","PeriodicalId":501315,"journal":{"name":"arXiv - CS - Multiagent Systems","volume":"23 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Learning in Multi-Objective Public Goods Games with Non-Linear Utilities\",\"authors\":\"Nicole Orzan, Erman Acar, Davide Grossi, Patrick Mannion, Roxana Rădulescu\",\"doi\":\"arxiv-2408.00682\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Addressing the question of how to achieve optimal decision-making under risk\\nand uncertainty is crucial for enhancing the capabilities of artificial agents\\nthat collaborate with or support humans. In this work, we address this question\\nin the context of Public Goods Games. We study learning in a novel\\nmulti-objective version of the Public Goods Game where agents have different\\nrisk preferences, by means of multi-objective reinforcement learning. We\\nintroduce a parametric non-linear utility function to model risk preferences at\\nthe level of individual agents, over the collective and individual reward\\ncomponents of the game. We study the interplay between such preference\\nmodelling and environmental uncertainty on the incentive alignment level in the\\ngame. 
We demonstrate how different combinations of individual preferences and\\nenvironmental uncertainties sustain the emergence of cooperative patterns in\\nnon-cooperative environments (i.e., where competitive strategies are dominant),\\nwhile others sustain competitive patterns in cooperative environments (i.e.,\\nwhere cooperative strategies are dominant).\",\"PeriodicalId\":501315,\"journal\":{\"name\":\"arXiv - CS - Multiagent Systems\",\"volume\":\"23 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Multiagent Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.00682\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Multiagent Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.00682","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Addressing the question of how to achieve optimal decision-making under risk and uncertainty is crucial for enhancing the capabilities of artificial agents that collaborate with or support humans. In this work, we address this question in the context of Public Goods Games. Using multi-objective reinforcement learning, we study learning in a novel multi-objective version of the Public Goods Game in which agents have different risk preferences. We introduce a parametric non-linear utility function to model risk preferences at the level of individual agents, over the collective and individual reward components of the game. We study how such preference modelling interacts with environmental uncertainty to shape incentive alignment in the game. We demonstrate how some combinations of individual preferences and environmental uncertainties sustain the emergence of cooperative patterns in non-cooperative environments (i.e., where competitive strategies are dominant), while others sustain competitive patterns in cooperative environments (i.e., where cooperative strategies are dominant).
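
The two-component reward structure and the parametric non-linear utility described in the abstract can be made concrete with a short sketch. The payoff split below follows the standard Public Goods Game; the CARA-style exponential curvature in `utility`, and the `endowment`, `multiplier`, and `risk` parameter names, are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def pgg_rewards(contributions, endowment=1.0, multiplier=1.5):
    """Split the standard Public Goods Game payoff into the two reward
    components of a multi-objective formulation: the individual component
    (what each agent keeps of its endowment) and the collective component
    (each agent's equal share of the multiplied common pot)."""
    contributions = np.asarray(contributions, dtype=float)
    n = len(contributions)
    individual = endowment - contributions             # kept endowment
    collective = multiplier * contributions.sum() / n  # shared return
    return individual, collective

def utility(individual, collective, risk=0.0):
    """Hypothetical parametric non-linear utility over the two reward
    components. The CARA-style exponential curvature is an assumption
    for illustration, not the paper's exact form: risk > 0 yields
    risk-averse agents, risk < 0 risk-seeking ones, and risk == 0
    recovers a linear, risk-neutral scalarisation."""
    total = individual + collective
    if risk == 0.0:
        return total
    return (1.0 - np.exp(-risk * total)) / risk

# Example: three agents contributing nothing, half, and everything.
ind, col = pgg_rewards([0.0, 0.5, 1.0])
print(utility(ind, col, risk=1.0))  # per-agent risk-averse valuations
```

Under this sketch, a risk-averse agent (risk > 0) discounts large joint payoffs relative to a risk-neutral one; this is the kind of per-agent preference heterogeneity that, per the abstract, can flip cooperative and competitive patterns across environments.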