Yaroslav Zhurba, A. Filchenkov, A. Azarov, A. Shalyto
{"title":"基于多智能体深度强化学习的输送带路径连续控制算法","authors":"Yaroslav Zhurba, A. Filchenkov, A. Azarov, A. Shalyto","doi":"10.31799/1684-8853-2022-6-10-19","DOIUrl":null,"url":null,"abstract":"Introduction: We consider the problem of routing of piece cargo by a conveyor system. When moving cargo pieces, it is necessary not only to minimize the time of transportation, but also to minimize the energy spent on it. Purpose: Development of a routing algorithm that is adaptive to changes in the topology of the routing graph and is able to optimize the delivery time and the consumed energy. Results: We propose an algorithm based on multi-agent deep reinforcement learning that places agents at the vertices of a conveyor network graph and uses a new state value function. The algorithm has two tunable parameters: the length of the path along which the state value function is calculated, and the learning coefficient. Through the selection of parameters, we have revealed that the optimal values are 2 and 1, respectively. An experimental study of the algorithm using a simulation model has shown that it allows to reduce the number of collisions of moving objects to zero, demonstrates stable results for both optimized scores, and also leads to a lower energy consumption compared with the method used as a baseline. Practical relevance: The proposed algorithm can be used to reduce delivery time and energy when managing conveyor systems.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2022-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Continuous control algorithms for conveyer belt routing based on multi-agent deep reinforcement learning\",\"authors\":\"Yaroslav Zhurba, A. Filchenkov, A. Azarov, A. Shalyto\",\"doi\":\"10.31799/1684-8853-2022-6-10-19\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Introduction: We consider the problem of routing of piece cargo by a conveyor system. When moving cargo pieces, it is necessary not only to minimize the time of transportation, but also to minimize the energy spent on it. Purpose: Development of a routing algorithm that is adaptive to changes in the topology of the routing graph and is able to optimize the delivery time and the consumed energy. Results: We propose an algorithm based on multi-agent deep reinforcement learning that places agents at the vertices of a conveyor network graph and uses a new state value function. The algorithm has two tunable parameters: the length of the path along which the state value function is calculated, and the learning coefficient. Through the selection of parameters, we have revealed that the optimal values are 2 and 1, respectively. An experimental study of the algorithm using a simulation model has shown that it allows to reduce the number of collisions of moving objects to zero, demonstrates stable results for both optimized scores, and also leads to a lower energy consumption compared with the method used as a baseline. Practical relevance: The proposed algorithm can be used to reduce delivery time and energy when managing conveyor systems.\",\"PeriodicalId\":36977,\"journal\":{\"name\":\"Informatsionno-Upravliaiushchie Sistemy\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Informatsionno-Upravliaiushchie Sistemy\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.31799/1684-8853-2022-6-10-19\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Mathematics\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Informatsionno-Upravliaiushchie Sistemy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.31799/1684-8853-2022-6-10-19","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Mathematics","Score":null,"Total":0}
Continuous control algorithms for conveyer belt routing based on multi-agent deep reinforcement learning
Introduction: We consider the problem of routing of piece cargo by a conveyor system. When moving cargo pieces, it is necessary not only to minimize the time of transportation, but also to minimize the energy spent on it. Purpose: Development of a routing algorithm that is adaptive to changes in the topology of the routing graph and is able to optimize the delivery time and the consumed energy. Results: We propose an algorithm based on multi-agent deep reinforcement learning that places agents at the vertices of a conveyor network graph and uses a new state value function. The algorithm has two tunable parameters: the length of the path along which the state value function is calculated, and the learning coefficient. Through the selection of parameters, we have revealed that the optimal values are 2 and 1, respectively. An experimental study of the algorithm using a simulation model has shown that it allows to reduce the number of collisions of moving objects to zero, demonstrates stable results for both optimized scores, and also leads to a lower energy consumption compared with the method used as a baseline. Practical relevance: The proposed algorithm can be used to reduce delivery time and energy when managing conveyor systems.