Transferring experiences in k-nearest neighbors based multiagent reinforcement learning: an application to traffic signal control

IF 1.4 4区计算机科学 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE AI Communications Pub Date : 2023-09-27 DOI:10.3233/aic-220305

Ana Lucia C. Bazzan, Vicente N. de Almeida, Monireh Abdoos

{"title":"Transferring experiences in k-nearest neighbors based multiagent reinforcement learning: an application to traffic signal control","authors":"Ana Lucia C. Bazzan, Vicente N. de Almeida, Monireh Abdoos","doi":"10.3233/aic-220305","DOIUrl":null,"url":null,"abstract":"The increasing demand for mobility in our society poses various challenges to traffic engineering, computer science in general, and artificial intelligence in particular. Increasing the capacity of road networks is not always possible, thus a more efficient use of the available transportation infrastructure is required. Another issue is that many problems in traffic management and control are inherently decentralized and/or require adaptation to the traffic situation. Hence, there is a close relationship to multiagent reinforcement learning. However, using reinforcement learning poses the challenge that the state space is normally large and continuous, thus it is necessary to find appropriate schemes to deal with discretization of the state space. To address these issues, a multiagent system with agents learning independently via a learning algorithm was proposed, which is based on estimating Q-values from k-nearest neighbors. In the present paper, we extend this approach and include transfer of experiences among the agents, especially when an agent does not have a good set of k experiences. We deal with traffic signal control, running experiments on a traffic network in which we vary the traffic situation along time, and compare our approach to two baselines (one involving reinforcement learning and one based on fixed times). Our results show that the extended method pays off when an agent returns to an already experienced traffic situation.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":"69 1","pages":"0"},"PeriodicalIF":1.4000,"publicationDate":"2023-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AI Communications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3233/aic-220305","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

The increasing demand for mobility in our society poses various challenges to traffic engineering, computer science in general, and artificial intelligence in particular. Increasing the capacity of road networks is not always possible, thus a more efficient use of the available transportation infrastructure is required. Another issue is that many problems in traffic management and control are inherently decentralized and/or require adaptation to the traffic situation. Hence, there is a close relationship to multiagent reinforcement learning. However, using reinforcement learning poses the challenge that the state space is normally large and continuous, thus it is necessary to find appropriate schemes to deal with discretization of the state space. To address these issues, a multiagent system with agents learning independently via a learning algorithm was proposed, which is based on estimating Q-values from k-nearest neighbors. In the present paper, we extend this approach and include transfer of experiences among the agents, especially when an agent does not have a good set of k experiences. We deal with traffic signal control, running experiments on a traffic network in which we vary the traffic situation along time, and compare our approach to two baselines (one involving reinforcement learning and one based on fixed times). Our results show that the extended method pays off when an agent returns to an already experienced traffic situation.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

基于k近邻的多智能体强化学习的经验传递:在交通信号控制中的应用

社会对移动性的需求日益增长，对交通工程、计算机科学、特别是人工智能提出了各种挑战。增加道路网络的容量并不总是可能的，因此需要更有效地利用现有的运输基础设施。另一个问题是，交通管理和控制中的许多问题本质上是分散的和/或需要适应交通状况。因此，这与多智能体强化学习有着密切的关系。然而，使用强化学习带来了状态空间通常很大且连续的挑战，因此有必要找到适当的方案来处理状态空间的离散化。为了解决这些问题，提出了一种基于k近邻估计q值的学习算法的多智能体系统。在本文中，我们扩展了这种方法，并包括代理之间的经验转移，特别是当一个代理没有k个良好的经验集时。我们处理交通信号控制，在交通网络上运行实验，其中我们随时间变化交通状况，并将我们的方法与两个基线(一个涉及强化学习，另一个基于固定时间)进行比较。我们的结果表明，当agent返回到已经经历过的交通状况时，扩展方法是有效的。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

AI Communications 工程技术-计算机：人工智能

CiteScore

2.30

自引率

12.50%

发文量

审稿时长

4.5 months

期刊介绍： AI Communications is a journal on artificial intelligence (AI) which has a close relationship to EurAI (European Association for Artificial Intelligence, formerly ECCAI). It covers the whole AI community: Scientific institutions as well as commercial and industrial companies. AI Communications aims to enhance contacts and information exchange between AI researchers and developers, and to provide supranational information to those concerned with AI and advanced information processing. AI Communications publishes refereed articles concerning scientific and technical AI procedures, provided they are of sufficient interest to a large readership of both scientific and practical background. In addition it contains high-level background material, both at the technical level as well as the level of opinions, policies and news.