An adaptive traffic signal control scheme with Proximal Policy Optimization based on deep reinforcement learning for a single intersection

Impact factor: 8.0 | Zone 2 (Computer Science) | JCR Q1 (AUTOMATION & CONTROL SYSTEMS) | Engineering Applications of Artificial Intelligence | Pub Date: 2025-03-11 | DOI: 10.1016/j.engappai.2025.110440
Lijuan Wang, Guoshan Zhang, Qiaoli Yang, Tianyang Han
Volume 149, Article 110440. Full text: https://www.sciencedirect.com/science/article/pii/S0952197625004403
Citations: 0

Abstract

Adaptive traffic signal control (ATSC) is an important means of alleviating traffic congestion and improving the quality of road traffic. Although deep reinforcement learning (DRL) has shown great potential for traffic signal control (TSC), state representation, reward design, and the action interval still need further study, and the advantages of policy learning have not been fully exploited in TSC. To address these issues, we propose a DRL-based traffic signal control scheme with Proximal Policy Optimization (PPO-TSC). We use vehicle waiting time and lane queue length, which represent the spatiotemporal characteristics of traffic flow, to design simplified traffic-state feature vectors, and we define a reward function that is consistent with the state. Additionally, we compare and analyze the performance indexes obtained by various methods using action intervals of 5 s, 10 s, and 15 s. The algorithm is implemented on the Actor-Critic architecture, using advantage estimation and the clip mechanism to constrain the range of gradient updates. We validate the proposed scheme at a single intersection in Simulation of Urban MObility (SUMO) under two traffic demand patterns, flat traffic and peak traffic. The experimental results show that the proposed method performs significantly better than the compared methods. Specifically, under peak traffic conditions, PPO-TSC reduces average travel time (ATT) by 24%, reduces average time loss (ATL) by 45%, and increases average speed (AS) by 16% compared with the existing methods.
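
The abstract names two ingredients of the learning rule, advantage estimation and the clip mechanism, without giving formulas. The sketch below illustrates the standard PPO clipped surrogate loss together with generalized advantage estimation (GAE). It is not the authors' implementation: the use of GAE specifically, the function names, and the default values of `gamma`, `lam`, and `clip_eps` are all assumptions made for illustration.

```python
import numpy as np


def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one rollout.

    Assumes the rollout contains no episode termination; `values[t]` is the
    critic's estimate for step t and `last_value` bootstraps the final step.
    All default hyperparameters are illustrative assumptions.
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.append(np.asarray(values, dtype=float), last_value)
    advantages = np.zeros_like(rewards)
    running = 0.0
    # Accumulate discounted TD errors backwards through the rollout.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped surrogate loss (to be minimized) for the actor.

    `new_logp` and `old_logp` are log-probabilities of the taken actions
    under the current and the data-collecting policy, respectively.
    """
    ratio = np.exp(np.asarray(new_logp, dtype=float) - np.asarray(old_logp, dtype=float))
    adv = np.asarray(advantages, dtype=float)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    # Taking the element-wise minimum removes the incentive to move the
    # probability ratio outside [1 - clip_eps, 1 + clip_eps].
    return -np.mean(np.minimum(unclipped, clipped))
```

In a signal-control setting, `rewards` would come from the simulator at each action interval and `values` from the critic network; both are placeholders here, since the abstract does not specify them numerically.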
Source journal
Engineering Applications of Artificial Intelligence (Engineering & Technology - Engineering: Electrical & Electronic)
CiteScore: 9.60
Self-citation rate: 10.00%
Articles published: 505
Review time: 68 days
Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.