Learning Effective Value Function Factorization via Attentional Communication

2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 629-634; Pub Date: 2020-10-11; DOI: 10.1109/SMC42975.2020.9283355

Bo Wu, Xiaoya Yang, Chuxiong Sun, Rui Wang, Xiaohui Hu, Yan Hu


Citations: 2

Abstract

Achieving efficient cooperation among agents in partially observed environments remains an overarching problem in multi-agent reinforcement learning (MARL). Value function factorization is a promising approach because it can efficiently address the multi-agent credit assignment problem. However, existing value function factorization methods have focused on learning fully decentralized value functions, which are not effective for some complex tasks. To address this limitation, we propose a framework that enhances value function factorization by allowing communication during execution. Communication introduces extra information that helps agents understand the complex environment and learn sophisticated factorizations. Furthermore, the proposed communication mechanism differs from existing methods in that each message is accompanied by a descriptive key. Using the descriptive key, agents can dynamically measure the importance of different messages and thereby achieve attentional communication. We evaluate our framework on a challenging set of StarCraft II micromanagement tasks and show that it significantly outperforms existing value function factorization methods.
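The abstract does not say how a descriptive key is compared against a receiver's state, so the following is only a minimal sketch of one common way to realize key-based message weighting: scaled dot-product attention. The receiver-side `query` vector, the function names, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, messages):
    """Weight incoming messages by how well each key matches the query.

    Assumption: importance is scored with scaled dot-product attention;
    the paper may use a different scoring function.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in keys])
    dim = len(messages[0])
    aggregated = [
        sum(w * msg[i] for w, msg in zip(weights, messages))
        for i in range(dim)
    ]
    return weights, aggregated


# Hypothetical toy data: one receiver, two senders.
query = [1.0, 0.0]                   # receiver's query vector (assumed)
keys = [[4.0, 0.0], [0.0, 4.0]]      # descriptive key sent with each message
messages = [[1.0, 2.0], [3.0, 4.0]]  # message payloads

weights, aggregated = attend(query, keys, messages)
```

In this toy setup the first sender's key aligns with the receiver's query, so its message receives the larger weight. In a full system the aggregated vector would presumably be fed into the agent's individual value function alongside its local observation, but that wiring is not described in the abstract.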

