Dingwei Wu, Kaifang Wan, Xiao-guang Gao, Zijian Hu
{"title":"复杂环境下基于深度强化学习的多智能体运动规划","authors":"Dingwei Wu, Kaifang Wan, Xiao-guang Gao, Zijian Hu","doi":"10.1109/ICCRE51898.2021.9435656","DOIUrl":null,"url":null,"abstract":"When agents in a multiagent system implement motion planning in complex and dynamic environments, model-based planning algorithms have poor adaptability, while intelligent algorithms, such as MADDPG, encounter difficulty in converging when training multiple agents, and the resulting control model has poor stability and robustness. To address the above challenges, this paper proposes a mixed experience multiagent deep deterministic policy gradient algorithm referred to as ME-MADDPG. The algorithm increases the high-quality experience obtained by artificial potential field method and uses dynamic probability to sample from different replay buffers. Simulation experiments have proven that compared to MADDPG, ME-MADDPG greatly improves convergence speed, convergence effect and stability and that ME-MADDPG can efficiently provide shorter and more convenient paths for multiagent systems.","PeriodicalId":382619,"journal":{"name":"2021 6th International Conference on Control and Robotics Engineering (ICCRE)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Multiagent Motion Planning Based on Deep Reinforcement Learning in Complex Environments\",\"authors\":\"Dingwei Wu, Kaifang Wan, Xiao-guang Gao, Zijian Hu\",\"doi\":\"10.1109/ICCRE51898.2021.9435656\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"When agents in a multiagent system implement motion planning in complex and dynamic environments, model-based planning algorithms have poor adaptability, while intelligent algorithms, such as MADDPG, encounter difficulty in converging when training multiple agents, and the resulting control model has poor stability and robustness. 
To address the above challenges, this paper proposes a mixed experience multiagent deep deterministic policy gradient algorithm referred to as ME-MADDPG. The algorithm increases the high-quality experience obtained by artificial potential field method and uses dynamic probability to sample from different replay buffers. Simulation experiments have proven that compared to MADDPG, ME-MADDPG greatly improves convergence speed, convergence effect and stability and that ME-MADDPG can efficiently provide shorter and more convenient paths for multiagent systems.\",\"PeriodicalId\":382619,\"journal\":{\"name\":\"2021 6th International Conference on Control and Robotics Engineering (ICCRE)\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-04-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 6th International Conference on Control and Robotics Engineering (ICCRE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCRE51898.2021.9435656\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 6th International Conference on Control and Robotics Engineering (ICCRE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCRE51898.2021.9435656","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multiagent Motion Planning Based on Deep Reinforcement Learning in Complex Environments
When agents in a multiagent system perform motion planning in complex, dynamic environments, model-based planning algorithms adapt poorly, while learning-based algorithms such as MADDPG converge slowly when training multiple agents, and the resulting control models lack stability and robustness. To address these challenges, this paper proposes a mixed-experience multiagent deep deterministic policy gradient algorithm, referred to as ME-MADDPG. The algorithm augments the replay memory with high-quality experience generated by an artificial potential field method and samples from the different replay buffers with a dynamically adjusted probability. Simulation experiments show that, compared with MADDPG, ME-MADDPG converges faster and more reliably, yields a more stable control model, and efficiently produces shorter, more practical paths for multiagent systems.
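The mixed-experience mechanism described above — a conventional buffer of agent-generated transitions plus a buffer of artificial-potential-field (APF) demonstrations, sampled with a dynamic probability — can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the class name, the linear annealing schedule, and all parameter values are assumptions chosen for clarity.

```python
import random
from collections import deque

class MixedReplayBuffer:
    """Sketch of mixed-experience sampling (assumed structure): each draw
    comes from the APF demonstration buffer with probability p, which is
    annealed over training so the agent relies on its own experience more
    as learning progresses. The linear decay schedule is an illustrative
    assumption, not necessarily the paper's choice."""

    def __init__(self, capacity=100_000, p_start=0.5, p_end=0.05,
                 decay_steps=50_000):
        self.rl_buffer = deque(maxlen=capacity)   # agent-generated experience
        self.apf_buffer = deque(maxlen=capacity)  # APF-generated experience
        self.p_start, self.p_end = p_start, p_end
        self.decay_steps = decay_steps
        self.step = 0

    def add_rl(self, transition):
        self.rl_buffer.append(transition)

    def add_apf(self, transition):
        self.apf_buffer.append(transition)

    def _apf_prob(self):
        # Linearly anneal the probability of drawing APF experience.
        frac = min(self.step / self.decay_steps, 1.0)
        return self.p_start + frac * (self.p_end - self.p_start)

    def sample(self, batch_size):
        """Draw a mixed minibatch; each element independently picks a
        source buffer according to the current dynamic probability."""
        self.step += 1
        p = self._apf_prob()
        batch = []
        for _ in range(batch_size):
            use_apf = self.apf_buffer and random.random() < p
            source = self.apf_buffer if use_apf else self.rl_buffer
            batch.append(random.choice(source))
        return batch
```

In a MADDPG-style training loop, the sampled minibatch would feed the centralized critic update exactly as a single-buffer minibatch would; only the sampling step changes.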