{"title":"XLight: An interpretable multi-agent reinforcement learning approach for traffic signal control","authors":"Sibin Cai , Jie Fang , Mengyun Xu","doi":"10.1016/j.eswa.2025.126938","DOIUrl":null,"url":null,"abstract":"<div><div>Recently, deep reinforcement learning (DRL)-based traffic signal control (TSC) methods have garnered significant attention among researchers, achieving substantial progress. However, current research often focuses on performance improvement, neglecting interpretability. DRL-based TSC methods often face challenges in interpretability. This limitation poses significant obstacles to practical deployment, given the liability and regulatory constraints faced by governmental authorities responsible for traffic management and control. On the other hand, interpretable RL-based TSC methods offer greater flexibility to meet specific requirements. For instance, prioritizing the clearance of vehicles in a particular movement can be easily achieved by assigning higher weights to the state variables associated with that movement. To address this issue, we propose <strong><em>Xlight</em></strong>, an interpretable multi-agent reinforcement learning (MARL) approach for TSC, which enhances interpretability in three key aspects: (a) meticulously designing and selecting the state space, action space, and reward function. Especially, we propose an interpretable reward function for network-wide TSC and prove that maximizing this reward is equivalent to minimizing the average travel time (ATT) in the road network; (b) introducing more practical regulatable (i.e., interpretable) functions as TSC controllers; and (c) employing maximum entropy policy optimization, which simultaneously enhances interpretability and improves transferability. Next, to better align with practical applications of network-wide TSC, we propose several interpretable MARL-based methods. Among these, Multi-Agent Regulatable Soft Actor-Critic (MARSAC) not only possesses interpretability but also achieves superior performance. Finally, comprehensive experiments conducted across various TSC scenarios, including isolated intersection, synthetic network-wide intersections, and real-world network-wide intersections, demonstrate the effectiveness. For example, in terms of the ATT metric, our proposed method achieves improvements of 9.55%, 34.17%, 3.98%, and 42.93% compared to the Actuated Traffic Signal Control (ATSC) across a synthetic road network and 3 real-world road networks. Furthermore, in the synthetic network, our method demonstrates improvements of 4.04% and 3.21% in the Safety Score and Fuel Consumption metrics, respectively, when compared to the ATSC.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"273 ","pages":"Article 126938"},"PeriodicalIF":7.5000,"publicationDate":"2025-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Expert Systems with Applications","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0957417425005603","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Recently, deep reinforcement learning (DRL)-based traffic signal control (TSC) methods have garnered significant attention among researchers, achieving substantial progress. However, current research often focuses on performance improvement, neglecting interpretability. DRL-based TSC methods often face challenges in interpretability. This limitation poses significant obstacles to practical deployment, given the liability and regulatory constraints faced by governmental authorities responsible for traffic management and control. On the other hand, interpretable RL-based TSC methods offer greater flexibility to meet specific requirements. For instance, prioritizing the clearance of vehicles in a particular movement can be easily achieved by assigning higher weights to the state variables associated with that movement. To address this issue, we propose Xlight, an interpretable multi-agent reinforcement learning (MARL) approach for TSC, which enhances interpretability in three key aspects: (a) meticulously designing and selecting the state space, action space, and reward function. Especially, we propose an interpretable reward function for network-wide TSC and prove that maximizing this reward is equivalent to minimizing the average travel time (ATT) in the road network; (b) introducing more practical regulatable (i.e., interpretable) functions as TSC controllers; and (c) employing maximum entropy policy optimization, which simultaneously enhances interpretability and improves transferability. Next, to better align with practical applications of network-wide TSC, we propose several interpretable MARL-based methods. Among these, Multi-Agent Regulatable Soft Actor-Critic (MARSAC) not only possesses interpretability but also achieves superior performance. Finally, comprehensive experiments conducted across various TSC scenarios, including isolated intersection, synthetic network-wide intersections, and real-world network-wide intersections, demonstrate the effectiveness. For example, in terms of the ATT metric, our proposed method achieves improvements of 9.55%, 34.17%, 3.98%, and 42.93% compared to the Actuated Traffic Signal Control (ATSC) across a synthetic road network and 3 real-world road networks. Furthermore, in the synthetic network, our method demonstrates improvements of 4.04% and 3.21% in the Safety Score and Fuel Consumption metrics, respectively, when compared to the ATSC.
期刊介绍:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.