Enhancing collaboration in multi-agent reinforcement learning with correlated trajectories

IF 7.2 | Zone 1 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence) | Knowledge-Based Systems | Pub Date: 2024-10-28 | DOI: 10.1016/j.knosys.2024.112665
Siying Wang, Hongfei Du, Yang Zhou, Zhitong Zhao, Ruoning Zhang, Wenyu Chen
Knowledge-Based Systems, Volume 305, Article 112665. https://www.sciencedirect.com/science/article/pii/S0950705124012991
Citations: 0

Abstract

Collaborative behaviors in human social activities can be modeled with multi-agent reinforcement learning and used to train collaborative policies that let agents cooperate efficiently. In general, agents that behave similarly share a degree of common behavioral cognition, so they are more likely to understand each other's intentions and, in turn, to form cooperative policies. Traditional approaches focus on the collaborative allocation process between agents and ignore the effects of similar behaviors and common-cognition characteristics in collaborative interactions. To better establish collaborative relationships between agents, we propose a novel collaborative multi-agent reinforcement learning algorithm based on the similarity of agents' behavioral features. In this model, the interactions among agents are modeled as a graph neural network. Specifically, the Pearson correlation coefficient is used to compute the similarity of the agents' historical trajectories as a measure of their common behavioral cognition, and this similarity sets the edge weights of the graph neural network. In addition, we design a state-information complementation module built on a transformer encoder to enhance the agents' decision representations. Experimental results on Predator–Prey and StarCraft II show that the proposed method effectively enhances collaborative behaviors between agents and improves the training efficiency of collaborative models.
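The abstract does not give the exact formulation, so the following is only a minimal sketch of the idea of using Pearson correlation between agents' historical trajectories as graph edge weights. Several details are assumptions made for illustration and are not taken from the paper: each agent's trajectory is flattened into a single vector, negative correlations are clipped to zero before aggregation, weights are row-normalized, and the function names `pearson_edge_weights` and `aggregate_neighbors` are hypothetical.

```python
import numpy as np


def pearson_edge_weights(trajectories: np.ndarray) -> np.ndarray:
    """Pairwise Pearson correlation between agents' flattened trajectories.

    trajectories: array of shape (n_agents, n_values), one row per agent
    (assumption: each history is flattened into a single vector).
    Returns a symmetric (n_agents, n_agents) matrix with entries in [-1, 1].
    """
    traj = np.asarray(trajectories, dtype=float)
    centered = traj - traj.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1)
    # A constant trajectory has zero variance; divide by 1 so its
    # correlations with other agents come out as 0 rather than NaN.
    safe_norms = np.where(norms > 0, norms, 1.0)
    unit = centered / safe_norms[:, None]
    corr = np.clip(unit @ unit.T, -1.0, 1.0)
    np.fill_diagonal(corr, 1.0)  # each agent is fully similar to itself
    return corr


def aggregate_neighbors(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """One weighted aggregation step over the agent graph.

    Assumption (not from the paper): negative correlations are clipped to
    zero and each row is normalized so an agent's weights sum to 1.
    """
    w = np.maximum(weights, 0.0)
    w = w / w.sum(axis=1, keepdims=True)  # diagonal is 1, so sums are > 0
    return w @ features


# Three agents: the first two move in the same direction, the third opposite.
trajs = np.array([[0.0, 1.0, 2.0, 3.0],
                  [1.0, 2.0, 3.0, 4.0],
                  [3.0, 2.0, 1.0, 0.0]])
edge_weights = pearson_edge_weights(trajs)
agent_features = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
mixed = aggregate_neighbors(agent_features, edge_weights)
```

In this toy setup the first two agents are perfectly correlated and therefore average each other's features, while the third agent, which is anti-correlated with both, keeps its own features unchanged. A learned graph neural network layer would replace the plain weighted average used here.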
Source journal
Knowledge-Based Systems (Engineering & Technology: Computer Science, Artificial Intelligence)
CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Review time: 7.8 months
Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.