Hybrid knowledge transfer for MARL based on action advising and experience sharing

IF 2.6 4区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Frontiers in Neurorobotics Pub Date : 2024-04-10 DOI:10.3389/fnbot.2024.1364587

Feng Liu, Dongqi Li, Jian Gao

{"title":"Hybrid knowledge transfer for MARL based on action advising and experience sharing","authors":"Feng Liu, Dongqi Li, Jian Gao","doi":"10.3389/fnbot.2024.1364587","DOIUrl":null,"url":null,"abstract":"Multiagent Reinforcement Learning (MARL) has been well adopted due to its exceptional ability to solve multiagent decision-making problems. To further enhance learning efficiency, knowledge transfer algorithms have been developed, among which experience-sharing-based and action-advising-based transfer strategies share the mainstream. However, it is notable that, although there exist many successful applications of both strategies, they are not flawless. For the long-developed action-advising-based methods (namely KT-AA, short for knowledge transfer based on action advising), their data efficiency and scalability are not satisfactory. As for the newly proposed experience-sharing-based knowledge transfer methods (KT-ES), although the shortcomings of KT-AA have been partially overcome, they are incompetent to correct specific bad decisions in the later learning stage. To leverage the superiority of both KT-AA and KT-ES, this study proposes KT-Hybrid, a hybrid knowledge transfer approach. In the early learning phase, KT-ES methods are employed, expecting better data efficiency from KT-ES to enhance the policy to a basic level as soon as possible. Later, we focus on correcting specific errors made by the basic policy, trying to use KT-AA methods to further improve the performance. Simulations demonstrate that the proposed KT-Hybrid outperforms well-received action-advising- and experience-sharing-based methods.","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"14 1","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in Neurorobotics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3389/fnbot.2024.1364587","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Multiagent Reinforcement Learning (MARL) has been well adopted due to its exceptional ability to solve multiagent decision-making problems. To further enhance learning efficiency, knowledge transfer algorithms have been developed, among which experience-sharing-based and action-advising-based transfer strategies share the mainstream. However, it is notable that, although there exist many successful applications of both strategies, they are not flawless. For the long-developed action-advising-based methods (namely KT-AA, short for knowledge transfer based on action advising), their data efficiency and scalability are not satisfactory. As for the newly proposed experience-sharing-based knowledge transfer methods (KT-ES), although the shortcomings of KT-AA have been partially overcome, they are incompetent to correct specific bad decisions in the later learning stage. To leverage the superiority of both KT-AA and KT-ES, this study proposes KT-Hybrid, a hybrid knowledge transfer approach. In the early learning phase, KT-ES methods are employed, expecting better data efficiency from KT-ES to enhance the policy to a basic level as soon as possible. Later, we focus on correcting specific errors made by the basic policy, trying to use KT-AA methods to further improve the performance. Simulations demonstrate that the proposed KT-Hybrid outperforms well-received action-advising- and experience-sharing-based methods.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

基于行动建议和经验分享的混合型 MARL 知识转让

多代理强化学习（MARL）因其解决多代理决策问题的卓越能力而被广泛采用。为了进一步提高学习效率，人们开发了知识迁移算法，其中基于经验分享的迁移策略和基于行动建议的迁移策略成为主流。然而，值得注意的是，尽管这两种策略都有许多成功的应用，但它们并非完美无缺。就开发已久的基于行动建议的方法（即 KT-AA，是基于行动建议的知识转移的简称）而言，其数据效率和可扩展性并不令人满意。至于新近提出的基于经验分享的知识转移方法（KT-ES），虽然部分克服了 KT-AA 的缺点，但却无法纠正后期学习阶段的具体错误决策。为了充分利用 KT-AA 和 KT-ES 的优势，本研究提出了混合知识转移方法 KT-Hybrid。在早期学习阶段，我们采用 KT-ES 方法，期望 KT-ES 能提供更好的数据效率，从而尽快将政策提升到基本水平。之后，我们将重点放在纠正基本策略的特定错误上，尝试使用 KT-AA 方法进一步提高性能。模拟结果表明，所提出的 KT-Hybrid 方法优于广受好评的基于行动建议和经验分享的方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Frontiers in Neurorobotics COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCER-ROBOTICS

CiteScore

5.20

自引率

6.50%

发文量

250

审稿时长

14 weeks

期刊介绍： Frontiers in Neurorobotics publishes rigorously peer-reviewed research in the science and technology of embodied autonomous neural systems. Specialty Chief Editors Alois C. Knoll and Florian Röhrbein at the Technische Universität München are supported by an outstanding Editorial Board of international experts. This multidisciplinary open-access journal is at the forefront of disseminating and communicating scientific knowledge and impactful discoveries to researchers, academics and the public worldwide. Neural systems include brain-inspired algorithms (e.g. connectionist networks), computational models of biological neural networks (e.g. artificial spiking neural nets, large-scale simulations of neural microcircuits) and actual biological systems (e.g. in vivo and in vitro neural nets). The focus of the journal is the embodiment of such neural systems in artificial software and hardware devices, machines, robots or any other form of physical actuation. This also includes prosthetic devices, brain machine interfaces, wearable systems, micro-machines, furniture, home appliances, as well as systems for managing micro and macro infrastructures. Frontiers in Neurorobotics also aims to publish radically new tools and methods to study plasticity and development of autonomous self-learning systems that are capable of acquiring knowledge in an open-ended manner. Models complemented with experimental studies revealing self-organizing principles of embodied neural systems are welcome. Our journal also publishes on the micro and macro engineering and mechatronics of robotic devices driven by neural systems, as well as studies on the impact that such systems will have on our daily life.