{"title":"使用强化学习在未知图中高效、基于群的路径查找","authors":"M. Aurangzeb, F. Lewis, M. Huber","doi":"10.2316/Journal.201.2014.3.201-2583","DOIUrl":null,"url":null,"abstract":"This paper addresses the problem of steering a swarm of autonomous agents out of an unknown maze to some goal located at an unknown location. This is particularly the case in situations where no direct communication between the agents is possible and all information exchange between agents has to occur indirectly through information “deposited” in the environment. To address this task, an ε-greedy, collaborative reinforcement learning method using only local information exchanges is introduced in this paper to balance exploitation and exploration in the unknown maze and to optimize the ability of the swarm to exit from the maze. The learning and routing algorithm given here provides a mechanism for storing data needed to represent the collaborative utility function based on the experiences of previous agents visiting a node that results in routing decisions that improve with time. Two theorems show the theoretical soundness of the proposed learning method and illustrate the importance of the stored information in improving decision-making for routing. Simulation examples show that the introduced simple rules of learning from past experience significantly improve performance over random search and search based on Ant Colony Optimization, a metaheuristic algorithm.","PeriodicalId":336534,"journal":{"name":"2013 10th IEEE International Conference on Control and Automation (ICCA)","volume":"172 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Efficient, swarm-based path finding in unknown graphs using reinforcement learning\",\"authors\":\"M. Aurangzeb, F. Lewis, M. Huber\",\"doi\":\"10.2316/Journal.201.2014.3.201-2583\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper addresses the problem of steering a swarm of autonomous agents out of an unknown maze to some goal located at an unknown location. This is particularly the case in situations where no direct communication between the agents is possible and all information exchange between agents has to occur indirectly through information “deposited” in the environment. To address this task, an ε-greedy, collaborative reinforcement learning method using only local information exchanges is introduced in this paper to balance exploitation and exploration in the unknown maze and to optimize the ability of the swarm to exit from the maze. The learning and routing algorithm given here provides a mechanism for storing data needed to represent the collaborative utility function based on the experiences of previous agents visiting a node that results in routing decisions that improve with time. Two theorems show the theoretical soundness of the proposed learning method and illustrate the importance of the stored information in improving decision-making for routing. 
Simulation examples show that the introduced simple rules of learning from past experience significantly improve performance over random search and search based on Ant Colony Optimization, a metaheuristic algorithm.\",\"PeriodicalId\":336534,\"journal\":{\"name\":\"2013 10th IEEE International Conference on Control and Automation (ICCA)\",\"volume\":\"172 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-06-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 10th IEEE International Conference on Control and Automation (ICCA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2316/Journal.201.2014.3.201-2583\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 10th IEEE International Conference on Control and Automation (ICCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2316/Journal.201.2014.3.201-2583","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Efficient, swarm-based path finding in unknown graphs using reinforcement learning
This paper addresses the problem of steering a swarm of autonomous agents out of an unknown maze to a goal at an unknown location. The task is particularly challenging when no direct communication between agents is possible and all information exchange must occur indirectly, through information "deposited" in the environment. To address this, the paper introduces an ε-greedy, collaborative reinforcement learning method that uses only local information exchanges to balance exploitation and exploration in the unknown maze and to optimize the swarm's ability to exit it. The learning and routing algorithm stores, at each node, the data needed to represent a collaborative utility function based on the experiences of previous agents that visited that node, so routing decisions improve over time. Two theorems establish the theoretical soundness of the proposed learning method and illustrate the importance of the stored information in improving routing decisions. Simulation examples show that the introduced simple rules for learning from past experience significantly outperform both random search and search based on Ant Colony Optimization, a metaheuristic algorithm.
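The abstract does not spell out the update rule, but the general idea of ε-greedy routing over utilities stored at nodes can be sketched as below. This Python sketch is illustrative only: the class and function names, the parameters (epsilon, alpha, gamma), and the TD-style utility update are assumptions, not the paper's exact formulation.

import random

class Node:
    """A graph node holding the collaborative utility 'deposited' by
    agents that visited it (names and update rule are illustrative
    assumptions, not the paper's exact algorithm)."""
    def __init__(self, name, utility=0.0):
        self.name = name
        self.utility = utility   # stored estimate of this node's value
        self.neighbors = []      # adjacent nodes discovered so far

def choose_next(node, epsilon=0.1):
    """Epsilon-greedy routing: usually move to the neighbor with the
    highest stored utility, occasionally explore a random one."""
    if not node.neighbors:
        return None
    if random.random() < epsilon:
        return random.choice(node.neighbors)             # explore
    return max(node.neighbors, key=lambda n: n.utility)  # exploit

def update_utility(node, gamma=0.9, alpha=0.5):
    """Pull the node's stored utility toward the discounted best
    neighbor utility, so value information deposited at the goal
    spreads backward through the maze as more agents pass through."""
    best = max(n.utility for n in node.neighbors)
    node.utility += alpha * (gamma * best - node.utility)

# Tiny example: a - b - goal; the goal node carries terminal utility 1.
a, b, goal = Node("a"), Node("b"), Node("goal", utility=1.0)
a.neighbors, b.neighbors = [b], [a, goal]
for _ in range(20):          # repeated agent visits refine the estimates
    update_utility(b)
    update_utility(a)
print(choose_next(b, epsilon=0.0).name)  # -> "goal"

In this sketch, later agents exploit the utilities written by earlier ones rather than communicating directly, which mirrors the indirect, environment-mediated information exchange the abstract describes.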