{"title":"Leveraging reinforcement learning for dynamic traffic control: A survey and challenges for field implementation","authors":"Yu Han, Meng Wang, Ludovic Leclercq","doi":"10.1016/j.commtr.2023.100104","DOIUrl":null,"url":null,"abstract":"<div><p>In recent years, the advancement of artificial intelligence techniques has led to significant interest in reinforcement learning (RL) within the traffic and transportation community. Dynamic traffic control has emerged as a prominent application field for RL in traffic systems. This paper presents a comprehensive survey of RL studies in dynamic traffic control, addressing the challenges associated with implementing RL-based traffic control strategies in practice, and identifying promising directions for future research. The first part of this paper provides a comprehensive overview of existing studies on RL-based traffic control strategies, encompassing their model designs, training algorithms, and evaluation methods. It is found that only a few studies have isolated the training and testing environments when evaluating their RL controllers. Subsequently, we examine the challenges involved in implementing existing RL-based traffic control strategies. We investigate the learning costs associated with online RL methods and the transferability of offline RL methods through simulation experiments. The simulation results reveal that online training methods with random exploration suffer from high exploration and learning costs. Additionally, the performance of offline RL methods is highly reliant on the accuracy of the training simulator. These limitations hinder the practical implementation of existing RL-based traffic control strategies. The final part of this paper summarizes and discusses a few existing efforts that attempt to overcome these challenges. This review highlights a rising volume of recent studies dedicated to mitigating the limitations of RL strategies, with the specific aim of enhancing their practical implementation.</p></div>","PeriodicalId":100292,"journal":{"name":"Communications in Transportation Research","volume":null,"pages":null},"PeriodicalIF":12.5000,"publicationDate":"2023-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S277242472300015X/pdfft?md5=127199f7739f428aa7133722ddb48d9f&pid=1-s2.0-S277242472300015X-main.pdf","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Communications in Transportation Research","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S277242472300015X","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"TRANSPORTATION","Score":null,"Total":0}
Citations: 1