A hybrid P2P and master-slave cooperative distributed multi-agent reinforcement learning technique with asynchronously triggered exploratory trials and clutter-index-based selected sub-goals

D. Megherbi, Minsuk Kim

Published in: 2016 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA)
Publication date: 2016-06-27
DOI: 10.1109/CIVEMSA.2016.7524249 (https://doi.org/10.1109/CIVEMSA.2016.7524249)
Citations: 3

Abstract

In many large infrastructures, such as military battlefields and transportation and maritime systems spanning hundreds of miles at a time, collaborative multi-agent monitoring is important. Agent Reinforcement Learning (RL) for autonomous path planning generally becomes more challenging in a dynamic, complex, cluttered environment, where agents could be moving randomly to reach their respective goals. In our previous work we presented a hybrid master-slave and peer-to-peer system architecture in which each distributed agent knows only of a given master node, is concerned only with its assigned workload, has limited knowledge of the environment, and can collaborate with other agents by sharing learned information about the environment over a communication network. In this paper we extend that work and focus on: (a) the performance of said system and the effect of the agents' random walks on the overall system agent learning speed when each distributed agent starts its exploratory trials asynchronously and independently of the other agents, immediately after it finishes its random walk phase or its first exploratory trial toward a sub-goal, without waiting for the slowest agent to finish its first random walk or its first exploratory phase toward a sub-goal; (b) the effect on agent learning speed of using an environment clutter index to select agent sub-goals, with the aim of reducing the agents' initial random walk steps; and (c) the effect on agent learning speed of agents sharing or not sharing environment information in such scenarios.
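
The abstract does not define the environment clutter index, so the following is only a minimal sketch of how such an index might drive sub-goal selection. It assumes a grid world in which 1 marks an obstacle and 0 a free cell, defines the index as the fraction of obstacle cells in a square window around a candidate cell, and assumes that less cluttered candidates are preferred. The names `clutter_index` and `select_sub_goal`, the window radius, and the grid encoding are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Grid = List[List[int]]  # assumed encoding: 1 = obstacle, 0 = free cell


def clutter_index(grid: Grid, cell: Tuple[int, int], radius: int = 1) -> float:
    """Fraction of obstacle cells in the square window centred on `cell`.

    Hypothetical definition for illustration only. The window is clipped
    at the grid border, so edge cells are averaged over fewer neighbours.
    """
    rows, cols = len(grid), len(grid[0])
    r0, c0 = cell
    window = [
        grid[r][c]
        for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1))
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1))
    ]
    return sum(window) / len(window)


def select_sub_goal(
    grid: Grid, candidates: List[Tuple[int, int]], radius: int = 1
) -> Tuple[int, int]:
    """Return the free candidate with the lowest clutter index.

    Assumes lower clutter is preferable; ties go to the first candidate listed.
    """
    free = [c for c in candidates if grid[c[0]][c[1]] == 0]
    if not free:
        raise ValueError("no free candidate cells")
    return min(free, key=lambda c: clutter_index(grid, c, radius))


if __name__ == "__main__":
    grid = [
        [0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0],
        [0, 0, 0, 1, 1],
        [0, 0, 0, 0, 0],
    ]
    candidates = [(1, 1), (1, 4), (3, 3)]
    for cell in candidates:
        print(cell, round(clutter_index(grid, cell), 3))
    print("selected sub-goal:", select_sub_goal(grid, candidates))
```

In this toy grid the candidate at (1, 1) sits in an obstacle-free neighbourhood and is selected, while (1, 4) is boxed in by obstacles. Whether the actual technique favours low-clutter or high-clutter sub-goals, and over what neighbourhood the index is computed, would have to be checked against the full paper.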