Continuous control algorithms for conveyer belt routing based on multi-agent deep reinforcement learning

Yaroslav Zhurba, A. Filchenkov, A. Azarov, A. Shalyto
DOI: 10.31799/1684-8853-2022-6-10-19 (https://doi.org/10.31799/1684-8853-2022-6-10-19)
Journal: Informatsionno-Upravliaiushchie Sistemy
Published: 2022-12-27
Publication type: Journal Article
Citations: 0

Abstract

Introduction: We consider the problem of routing piece cargo through a conveyor system. When moving cargo pieces, it is necessary to minimize not only the transportation time but also the energy spent on it.
Purpose: To develop a routing algorithm that adapts to changes in the topology of the routing graph and optimizes both delivery time and energy consumption.
Results: We propose an algorithm based on multi-agent deep reinforcement learning that places agents at the vertices of the conveyor network graph and uses a new state value function. The algorithm has two tunable parameters: the length of the path along which the state value function is calculated, and the learning coefficient. Parameter selection showed that the optimal values are 2 and 1, respectively. An experimental study on a simulation model showed that the algorithm reduces the number of collisions between moving objects to zero, produces stable results for both optimized metrics, and consumes less energy than the baseline method.
Practical relevance: The proposed algorithm can be used to reduce delivery time and energy consumption when managing conveyor systems.
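
The abstract does not give the state value function or the network architecture, so the following is only a toy illustration of the two tunable parameters it names, not the authors' method. It is a tabular (not deep) sketch in which each vertex keeps one cost-to-go estimate, the update target is computed along a lookahead path of `depth` edges, and `alpha` plays the role of the learning coefficient. The graph, the edge costs, the `energy_weight` trade-off, and all function names are hypothetical. The sketch also assumes every non-sink vertex has at least one outgoing edge.

```python
from math import inf
from typing import Dict, Tuple

# Hypothetical conveyor network: vertex -> {neighbour: (time, energy)}.
Graph = Dict[str, Dict[str, Tuple[float, float]]]

GRAPH: Graph = {
    "A": {"B": (1.0, 2.0), "C": (2.0, 0.5)},
    "B": {"D": (1.0, 2.0)},
    "C": {"D": (1.0, 0.5)},
    "D": {},
}


def lookahead(graph: Graph, values: Dict[str, float], node: str, sink: str,
              depth: int, energy_weight: float) -> float:
    """Cheapest cost of a path of at most `depth` edges starting at `node`,
    plus the current value estimate at the vertex where the path stops."""
    if node == sink:
        return 0.0
    if depth == 0:
        return values[node]
    best = inf
    for nxt, (time, energy) in graph[node].items():
        step = time + energy_weight * energy  # assumed scalar cost
        rest = lookahead(graph, values, nxt, sink, depth - 1, energy_weight)
        best = min(best, step + rest)
    return best


def train(graph: Graph, sink: str, depth: int = 2, alpha: float = 1.0,
          energy_weight: float = 1.0, sweeps: int = 3) -> Dict[str, float]:
    """Each vertex agent repeatedly moves its estimate toward the lookahead target."""
    values = {node: 0.0 for node in graph}
    for _ in range(sweeps):
        for node in graph:
            if node == sink:
                continue
            target = lookahead(graph, values, node, sink, depth, energy_weight)
            values[node] += alpha * (target - values[node])
    return values


def next_hop(graph: Graph, values: Dict[str, float], node: str, sink: str,
             depth: int = 2, energy_weight: float = 1.0) -> str:
    """Neighbour that minimises edge cost plus the remaining lookahead cost."""
    def score(nxt: str) -> float:
        time, energy = graph[node][nxt]
        rest = lookahead(graph, values, nxt, sink, depth - 1, energy_weight)
        return time + energy_weight * energy + rest
    return min(graph[node], key=score)


values = train(GRAPH, sink="D")
print(next_hop(GRAPH, values, "A", "D"))  # prints C
```

With `energy_weight=1.0` the agent at `A` prefers the slower but cheaper branch through `C`; setting `energy_weight=0.0` makes the same agent pick the faster branch through `B`, which is the time-versus-energy trade-off the abstract describes.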
Source journal: Informatsionno-Upravliaiushchie Sistemy
Subject area: Mathematics (Control and Optimization)
CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 35