COMPARATIVE EVALUATION OF THE EFFICIENCY OF REINFORCEMENT MACHINE LEARNING ALGORITHMS AND COMPARATIVE EVALUATION OF TRANSPORT PROCESSES IN THE ENVIRONMENT ANYLOGIC

System analysis and logistics. Pub Date: 2022-06-01. DOI: 10.31799/2077-5687-2022-2-22-41

S. A. Andronov, M. Prokofieva



Abstract

Increasing the throughput of sections of a metropolitan transport network without changing the existing infrastructure is the task of automated traffic control systems, which generate control actions on objects of the transport system in real time. An adequate response to load changes in the network is achieved by controlling traffic lights with built-in adaptive algorithms, among which artificial intelligence technologies, in particular neural networks, are increasingly used. These approaches compete with the widely used optimization-based adaptation method (phase duration, offset, etc.). This article compares the efficiency of an optimization algorithm with that of reinforcement learning algorithms in the AnyLogic simulation environment, using the time vehicles spend in the intersection system as the criterion. The study is intended to help identify a more efficient way of setting the duration of traffic light control phases. The reinforcement learning algorithm adapted to the input intensities within a schedule better than the optimization algorithm. However, it was also more sensitive to the type of schedule: at the optimal point for the weekday schedule it outperformed the optimization algorithm by about 20%. Its advantage was more pronounced on the second schedule, which features a trend of increasing traffic intensity: 61% compared with optimization and 70% compared with the base setting. On this schedule it proved practically insensitive to "detuning" of the input data relative to the optimal policy when the intensity levels were changed. Thus, for the real intersection studied, the traffic regulation results obtained by simulation with reinforcement learning are superior to those of the optimization approach but are sensitive to the given intensity schedule.
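The abstract does not describe the learning algorithm or the AnyLogic model in detail, so the following is only a toy sketch of the general idea of learning phase durations against an intensity schedule. Everything in it is an assumption made for illustration: the fixed 60 s cycle, the three candidate green durations, the saturation flow, the three-period schedule, and the cost function (vehicles left unserved after one cycle, used as a crude proxy for time spent in the system). The learner is a one-step, bandit-style tabular value update with epsilon-greedy exploration, not the authors' method, and no AnyLogic API is used.

```python
import random

CYCLE_S = 60                      # assumed fixed cycle length, seconds
GREEN_NS_OPTIONS = [20, 30, 40]   # assumed candidate north-south green times, seconds
SAT_FLOW = 0.5                    # assumed vehicles discharged per second of green

# Hypothetical intensity schedule: (NS arrivals/s, EW arrivals/s) for each period.
SCHEDULE = [(0.10, 0.10), (0.30, 0.10), (0.10, 0.30)]


def unserved_vehicles(period, green_ns):
    """Toy cost: vehicles still queued after one cycle on both approaches."""
    rate_ns, rate_ew = SCHEDULE[period]
    green_ew = CYCLE_S - green_ns
    left_ns = max(0.0, rate_ns * CYCLE_S - SAT_FLOW * green_ns)
    left_ew = max(0.0, rate_ew * CYCLE_S - SAT_FLOW * green_ew)
    return left_ns + left_ew


def train(episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Learn one value per (period, green option) with epsilon-greedy choice."""
    rng = random.Random(seed)
    n_actions = len(GREEN_NS_OPTIONS)
    q = [[0.0] * n_actions for _ in SCHEDULE]
    for _ in range(episodes):
        for period in range(len(SCHEDULE)):
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)      # explore
            else:
                action = max(range(n_actions), key=lambda i: q[period][i])
            reward = -unserved_vehicles(period, GREEN_NS_OPTIONS[action])
            # One-step update: move the estimate toward the observed reward.
            q[period][action] += alpha * (reward - q[period][action])
    return q


def greedy_policy(q):
    """Pick the highest-valued green duration for each schedule period."""
    return [GREEN_NS_OPTIONS[max(range(len(row)), key=row.__getitem__)] for row in q]


def total_unserved(policy):
    """Sum the toy cost over the schedule for one green duration per period."""
    return sum(unserved_vehicles(p, g) for p, g in enumerate(policy))


if __name__ == "__main__":
    learned = greedy_policy(train())
    fixed = [30] * len(SCHEDULE)   # assumed base setting: equal split in every period
    print("learned policy:", learned, "unserved:", total_unserved(learned))
    print("fixed policy:  ", fixed, "unserved:", total_unserved(fixed))
```

Because the learned values are tied to the periods of one schedule, a policy trained this way would have to be retrained for a differently shaped schedule, which is one plausible reading of the schedule sensitivity reported in the abstract. The numbers this toy produces say nothing about the 20%, 61% or 70% figures above.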
