Learning Effective Value Function Factorization via Attentional Communication

Bo Wu, Xiaoya Yang, Chuxiong Sun, Rui Wang, Xiaohui Hu, Yan Hu
2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 629-634. Published 2020-10-11. DOI: 10.1109/SMC42975.2020.9283355 (https://doi.org/10.1109/SMC42975.2020.9283355)
Citation count: 2

Abstract

Achieving efficient cooperation among agents in partially observed environments remains an overarching problem in multi-agent reinforcement learning (MARL). Value function factorization is a promising approach because it efficiently addresses the multi-agent credit assignment problem. However, existing value function factorization methods focus on learning fully decentralized value functions, which are not effective for some complex tasks. To address this limitation, we propose a framework that enhances value function factorization by allowing communication during execution. Communication introduces extra information that helps agents understand the complex environment and learn sophisticated factorizations. Furthermore, the proposed communication mechanism differs from existing methods in that each message is accompanied by a descriptive key. Using the descriptive key, agents can dynamically measure the importance of different messages and achieve attentional communication. We evaluate our framework on a challenging set of StarCraft II micromanagement tasks and show that it significantly outperforms existing value function factorization methods.
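The abstract does not say how the descriptive key is used to score messages, so the following is only an illustrative sketch of key-based attention over incoming messages. It assumes a scaled dot-product score between a receiver-side query vector and each sender's key, followed by a softmax. The query vector, the scoring rule, the function names, and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, messages):
    """Weight each incoming message by how well its key matches the query.

    Assumption: importance is a scaled dot product between the receiver's
    query and the sender's descriptive key. The paper may use another rule.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(messages[0])
    # Attention-weighted sum of the message vectors.
    aggregated = [
        sum(w * msg[i] for w, msg in zip(weights, messages)) for i in range(dim)
    ]
    return weights, aggregated


# Hypothetical example: one receiver, three senders.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
messages = [[0.5, 0.1], [0.2, 0.9], [0.7, 0.3]]

weights, aggregated = attend(query, keys, messages)
print(weights)
print(aggregated)
```

In this sketch the first sender's key is aligned with the query, so its message receives the largest weight. In a full system the aggregated vector would presumably feed into the receiving agent's individual value function before factorization, but that wiring is not described in the abstract.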