Developing an eco-driving strategy in a hybrid traffic network using reinforcement learning

Umar Jamil, Mostafa Malmir, Alan Chen, Monika Filipovska, Mimi Xie, Caiwen Ding, Yu-Fang Jin

Science Progress, published 2024-07-01. DOI: 10.1177/00368504241263406. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11320699/pdf/

Abstract
Eco-driving has garnered considerable research attention owing to its potential socio-economic impact, including enhanced public health and mitigated climate change effects through the reduction of greenhouse gas emissions. With more autonomous vehicles (AVs) expected on the road, developing an eco-driving strategy for hybrid traffic networks that encompass AVs and human-driven vehicles (HDVs), coordinated with traffic lights, is a challenging task. The challenge stems partly from insufficient infrastructure for collecting, transmitting, and sharing real-time traffic data among vehicles, facilities, and traffic control centers, and for the subsequent decision-making of the agents involved in traffic control. Additionally, the intricate nature of the existing traffic network, with its diverse array of vehicles and facilities, hinders the development of a mathematical model that accurately characterizes the network. In this study, we used the Simulation of Urban Mobility (SUMO) simulator to tackle the first challenge through computational analysis. To address the second challenge, we employed a model-free reinforcement learning (RL) algorithm, proximal policy optimization (PPO), to decide the actions of the AVs and the traffic-light signals in the network. A novel eco-driving strategy was proposed in which different percentages of AVs are introduced into the traffic flow and collaborate with the traffic-light signals, using RL to control the overall speed of the vehicles and thereby improve fuel efficiency. Average rewards at different AV penetration rates (5%, 10%, and 20% of total vehicles) were compared with the situation without any AVs in the traffic flow (0% penetration). The 10% penetration rate converged to its average reward in the shortest time and led to a significant reduction in fuel consumption and in the total delay of all vehicles.
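To make the described setup concrete, the sketch below shows one way such an experiment could be wired together: a SUMO simulation driven through the TraCI Python API, wrapped as a Gymnasium environment in which a PPO agent (stable-baselines3) jointly chooses a target speed for the AVs and a traffic-light phase, with a reward that penalizes fuel consumption and waiting time. This is a minimal illustrative sketch, not the authors' implementation: the configuration file name ("network.sumocfg"), the traffic-light id ("tls0"), the "av_" vehicle-id prefix, the speed levels, and the reward weights are all assumptions.

```python
# Hypothetical sketch: SUMO + TraCI wrapped as a Gymnasium env, trained with PPO.
# Requires a local SUMO installation plus an assumed "network.sumocfg" scenario.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
import traci  # SUMO's Python control API
from stable_baselines3 import PPO

SUMO_CMD = ["sumo", "-c", "network.sumocfg", "--no-step-log", "true"]
SPEED_LEVELS = [5.0, 8.0, 11.0, 14.0, 17.0]   # candidate AV target speeds (m/s), assumed
N_TL_PHASES = 4                               # phases of the assumed light "tls0"


class HybridTrafficEnv(gym.Env):
    """One RL step = one SUMO step; the agent sets the AV speed and the light phase."""

    def __init__(self, max_steps=1000):
        super().__init__()
        self.max_steps = max_steps
        self.steps = 0
        self.running = False
        # action = (index into SPEED_LEVELS, traffic-light phase index)
        self.action_space = spaces.MultiDiscrete([len(SPEED_LEVELS), N_TL_PHASES])
        # observation = [mean speed, vehicle count, current phase, mean waiting time]
        self.observation_space = spaces.Box(low=0.0, high=np.inf, shape=(4,), dtype=np.float32)

    def _observe(self):
        vehicles = traci.vehicle.getIDList()
        speeds = [traci.vehicle.getSpeed(v) for v in vehicles] or [0.0]
        waits = [traci.vehicle.getWaitingTime(v) for v in vehicles] or [0.0]
        phase = traci.trafficlight.getPhase("tls0")
        return np.array([np.mean(speeds), len(vehicles), phase, np.mean(waits)], dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        if self.running:
            traci.close()
        traci.start(SUMO_CMD)
        self.running = True
        self.steps = 0
        return self._observe(), {}

    def step(self, action):
        speed_idx, phase_idx = int(action[0]), int(action[1])
        # Apply the chosen target speed to every AV (assumed "av_" id prefix).
        for v in traci.vehicle.getIDList():
            if v.startswith("av_"):
                traci.vehicle.setSpeed(v, SPEED_LEVELS[speed_idx])
        traci.trafficlight.setPhase("tls0", phase_idx)
        traci.simulationStep()
        self.steps += 1

        vehicles = traci.vehicle.getIDList()
        fuel = sum(traci.vehicle.getFuelConsumption(v) for v in vehicles)
        delay = sum(traci.vehicle.getWaitingTime(v) for v in vehicles)
        # Penalize fuel use and accumulated waiting time; the weights are assumptions.
        reward = -1e-3 * fuel - 1e-2 * delay

        terminated = traci.simulation.getMinExpectedNumber() == 0  # all vehicles have left
        truncated = self.steps >= self.max_steps
        return self._observe(), reward, terminated, truncated, {}

    def close(self):
        if self.running:
            traci.close()
            self.running = False


if __name__ == "__main__":
    env = HybridTrafficEnv()
    model = PPO("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=50_000)  # training budget chosen arbitrarily for the sketch
    env.close()
```

Comparing penetration rates as in the abstract would then amount to regenerating the SUMO route file with 0%, 5%, 10%, or 20% of vehicle ids carrying the AV prefix and rerunning the same training loop for each case.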
Journal introduction:
Science Progress has for over 100 years been a highly regarded review publication in science, technology and medicine. Its objective is to excite the readers' interest in areas with which they may not be fully familiar but which could facilitate their interest, or even activity, in a cognate field.