{"title":"基于层次化 DDPG 的短通信距离多代理集体运动强化学习框架","authors":"Jiaxin Li;Peng Yi;Tong Duan;Zhen Zhang;Tao Hu","doi":"10.1109/TMLCN.2024.3400059","DOIUrl":null,"url":null,"abstract":"Collective motion is an important research content in the multi-agent control field. However, existing multi-agent collective motion methods typically assume large communication ranges of individual agents; in the scenario of leader-follower control with short communication ranges, if the leader dynamically changes its velocity without considering the followers’ states, the communication topology may be easily disconnected, making multi-agent collective motion more challenging. In this work, a novel Hierarchical DeepDeterministic PolicyGradient (HDDPG) based reinforcement learning framework is proposed to realize multi-agent collective motion with short communication ranges, ensuring the communication topology connected as much as possible. In H-DDPG, multiple agents with one single leader and numerous followers are dynamically divided into several hierarchies to conduct distributed control when the leader’s velocity changes. Two algorithms based on DDPG and the hierarchical strategy are designed to train followers in the first layer and followers in layers other than the first layer separately, which ensures that the agents form a tight swarm from scattered distribution and all followers can track the leader effectively. The experimental results demonstrate that with short communication ranges, H-DDPG outperforms the hierarchical flocking method in keeping the communication topology connection and shaping a tighter swarm.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"2 ","pages":"633-644"},"PeriodicalIF":0.0000,"publicationDate":"2024-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10529296","citationCount":"0","resultStr":"{\"title\":\"Hierarchical DDPG Based Reinforcement Learning Framework for Multi-Agent Collective Motion With Short Communication Ranges\",\"authors\":\"Jiaxin Li;Peng Yi;Tong Duan;Zhen Zhang;Tao Hu\",\"doi\":\"10.1109/TMLCN.2024.3400059\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Collective motion is an important research content in the multi-agent control field. However, existing multi-agent collective motion methods typically assume large communication ranges of individual agents; in the scenario of leader-follower control with short communication ranges, if the leader dynamically changes its velocity without considering the followers’ states, the communication topology may be easily disconnected, making multi-agent collective motion more challenging. In this work, a novel Hierarchical DeepDeterministic PolicyGradient (HDDPG) based reinforcement learning framework is proposed to realize multi-agent collective motion with short communication ranges, ensuring the communication topology connected as much as possible. In H-DDPG, multiple agents with one single leader and numerous followers are dynamically divided into several hierarchies to conduct distributed control when the leader’s velocity changes. Two algorithms based on DDPG and the hierarchical strategy are designed to train followers in the first layer and followers in layers other than the first layer separately, which ensures that the agents form a tight swarm from scattered distribution and all followers can track the leader effectively. 
The experimental results demonstrate that with short communication ranges, H-DDPG outperforms the hierarchical flocking method in keeping the communication topology connection and shaping a tighter swarm.\",\"PeriodicalId\":100641,\"journal\":{\"name\":\"IEEE Transactions on Machine Learning in Communications and Networking\",\"volume\":\"2 \",\"pages\":\"633-644\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-03-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10529296\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Machine Learning in Communications and Networking\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10529296/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Machine Learning in Communications and Networking","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10529296/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Hierarchical DDPG Based Reinforcement Learning Framework for Multi-Agent Collective Motion With Short Communication Ranges
Collective motion is an important research topic in the field of multi-agent control. However, existing multi-agent collective motion methods typically assume that individual agents have large communication ranges; in leader-follower control with short communication ranges, if the leader dynamically changes its velocity without considering the followers' states, the communication topology can easily become disconnected, making multi-agent collective motion more challenging. In this work, a novel Hierarchical Deep Deterministic Policy Gradient (H-DDPG) based reinforcement learning framework is proposed to realize multi-agent collective motion with short communication ranges, keeping the communication topology connected as much as possible. In H-DDPG, a group of agents with a single leader and numerous followers is dynamically divided into several hierarchies to conduct distributed control when the leader's velocity changes. Two algorithms, based on DDPG and the hierarchical strategy, are designed to separately train the followers in the first layer and the followers in the remaining layers, ensuring that the agents form a tight swarm from an initially scattered distribution and that all followers can track the leader effectively. The experimental results demonstrate that, with short communication ranges, H-DDPG outperforms the hierarchical flocking method in keeping the communication topology connected and in forming a tighter swarm.
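To make the hierarchical grouping concrete, the following is a minimal Python sketch of one plausible way to assign followers to layers by hop distance from the leader over the short-range communication graph: layer-1 followers are those within communication range of the leader, layer-2 followers are within range of some layer-1 follower, and so on. This is only an illustration of the idea summarized in the abstract, not the authors' published algorithm; the function and parameter names (build_hierarchy, comm_range) are assumptions.

```python
# Illustrative sketch, not the authors' code: group followers into hierarchy
# layers by hop distance from the leader over the short-range communication graph.
from collections import deque
import numpy as np

def build_hierarchy(leader_pos, follower_pos, comm_range):
    """Return (layer assignment, list of followers disconnected from the leader).

    leader_pos: array of shape (d,) with the leader position.
    follower_pos: array of shape (m, d) with follower positions.
    comm_range: scalar communication radius (assumed name).
    """
    positions = np.vstack([leader_pos, follower_pos])  # row 0 is the leader
    n = len(positions)
    # Two agents can communicate if their distance is at most comm_range.
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    adjacent = (dist <= comm_range) & ~np.eye(n, dtype=bool)

    layer = {0: 0}            # the leader sits in layer 0
    queue = deque([0])
    while queue:              # breadth-first search outward from the leader
        i = queue.popleft()
        for j in np.flatnonzero(adjacent[i]):
            if j not in layer:
                layer[j] = layer[i] + 1
                queue.append(j)

    # Followers never reached by the search have a disconnected topology.
    disconnected = [j for j in range(1, n) if j not in layer]
    return layer, disconnected
```

Under this reading, layer-1 followers would be trained to track the leader directly, while followers in deeper layers track neighbors in the layer above, so the swarm can follow the leader even though most agents never communicate with it directly.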