Spatio-temporal dynamic navigation for electric vehicle charging using deep reinforcement learning

IET Intelligent Transport Systems, vol. 18, no. 12, pp. 2520-2531 · Pub Date: 2024-11-10 · DOI: 10.1049/itr2.12588 · IF 2.3 · Zone 4 (Engineering & Technology) · JCR Q2 (Engineering, Electrical & Electronic)

Ali Can Erüst, Fatma Yıldız Taşcıkaraoğlu


Citations: 0

Abstract

This paper addresses the real-time spatio-temporal electric vehicle charging navigation problem in a dynamic environment using a shortest-path-based reinforcement learning approach. In a data-sharing system comprising a transportation network, an electric vehicle (EV) and EV charging stations (EVCSs), the aim is to determine the most convenient EVCS and the optimal path to it so as to reduce travel, charging and waiting costs. To estimate the waiting times at EVCSs, a Gaussian process regression algorithm is integrated, using a real-time dataset comprising the state of charge and the arrival and departure times of EVs. The optimization problem is modelled as a Markov decision process with unknown transition probabilities to handle the uncertainties arising from time-varying variables. Phasic policy gradient (PPG), a recently proposed on-policy actor–critic method, extends the proximal policy optimization (PPO) algorithm with an auxiliary optimization phase that improves training by distilling features from the critic into the actor network. PPG is used to make EVCS decisions on the network, where the EV travels along the optimal path from the origin node to the EVCS, taking into account dynamic traffic conditions, the unit value of the EV owner and the time-of-use charging price. Three case studies are carried out on the 24-node Sioux Falls benchmark network. The results show that PPG achieves on average a 9% better reward than PPO, and that the total time decreases by 7–10% when the EV owner cost is considered.
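The trade-off between travel, waiting and charging costs described in the abstract can be illustrated with a small greedy baseline. The sketch below is not the paper's PPG policy: it simply runs Dijkstra's algorithm over fixed travel times and picks the station with the lowest combined cost. The cost formula, the function names, the `value_of_time` parameter and all toy numbers are assumptions made for illustration; in the paper the waiting times come from Gaussian process regression and the decision is learned.

```python
import heapq


def shortest_travel_times(graph, origin):
    """Dijkstra: minimum travel time (minutes) from origin to each reachable node."""
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, minutes in graph.get(node, {}).items():
            candidate = elapsed + minutes
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return best


def station_cost(travel_min, station, value_of_time, energy_kwh):
    """Hypothetical cost: monetised time (travel + wait + charge) plus energy bought."""
    charge_min = 60.0 * energy_kwh / station["power_kw"]
    total_hours = (travel_min + station["wait_min"] + charge_min) / 60.0
    return value_of_time * total_hours + station["price_per_kwh"] * energy_kwh


def pick_station(graph, origin, stations, value_of_time, energy_kwh):
    """Return the cheapest reachable station and the cost of every candidate."""
    times = shortest_travel_times(graph, origin)
    costs = {
        node: station_cost(times[node], info, value_of_time, energy_kwh)
        for node, info in stations.items()
        if node in times
    }
    return min(costs, key=costs.get), costs


# Toy network and stations, invented for illustration only.
GRAPH = {
    "A": {"B": 10.0, "C": 25.0},
    "B": {"A": 10.0, "C": 10.0, "D": 30.0},
    "C": {"A": 25.0, "B": 10.0, "D": 5.0},
    "D": {"B": 30.0, "C": 5.0},
}
STATIONS = {
    "C": {"wait_min": 30.0, "power_kw": 50.0, "price_per_kwh": 0.30},
    "D": {"wait_min": 5.0, "power_kw": 50.0, "price_per_kwh": 0.40},
}

best, costs = pick_station(GRAPH, "A", STATIONS, value_of_time=12.0, energy_kwh=20.0)
cheapest_energy, _ = pick_station(GRAPH, "A", STATIONS, value_of_time=0.0, energy_kwh=20.0)
```

With these toy numbers, station D is selected when the owner's time is valued at 12 per hour, because its short queue outweighs its higher price. Setting `value_of_time` to 0 switches the choice to the cheaper station C, which mirrors the abstract's point that accounting for the EV owner cost changes the outcome.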


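Source journal

IET Intelligent Transport Systems (Engineering & Technology: Transportation Science & Technology)

CiteScore: 6.50
Self-citation rate: 7.40%
Articles published: 159
Review time: 3 months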

About the journal: IET Intelligent Transport Systems is an interdisciplinary journal devoted to research into the practical applications of ITS and infrastructures. The scope of the journal includes the following:

Sustainable traffic solutions
Deployments with enabling technologies
Pervasive monitoring
Applications; demonstrations and evaluation
Economic and behavioural analyses of ITS services and scenario
Data Integration and analytics
Information collection and processing; image processing applications in ITS
ITS aspects of electric vehicles
Autonomous vehicles; connected vehicle systems; In-vehicle ITS, safety and vulnerable road user aspects
Mobility as a service systems
Traffic management and control
Public transport systems technologies
Fleet and public transport logistics
Emergency and incident management
Demand management and electronic payment systems
Traffic related air pollution management
Policy and institutional issues
Interoperability, standards and architectures
Funding scenarios
Enforcement
Human machine interaction
Education, training and outreach

Current Special Issue Call for papers:

Intelligent Transportation Systems in Smart Cities for Sustainable Environment - https://digital-library.theiet.org/files/IET_ITS_CFP_ITSSCSE.pdf
Sustainably Intelligent Mobility (SIM) - https://digital-library.theiet.org/files/IET_ITS_CFP_SIM.pdf
Traffic Theory and Modelling in the Era of Artificial Intelligence and Big Data (in collaboration with World Congress for Transport Research, WCTR 2019) - https://digital-library.theiet.org/files/IET_ITS_CFP_WCTR.pdf