Multi-Timescale Reward-Based DRL Energy Management for Regenerative Braking Energy Storage System

IEEE Transactions on Transportation Electrification | Impact Factor: 8.3 | CAS Region 1 (Engineering) | Q1, Engineering, Electrical & Electronic | Published: 2025-01-10 | DOI: 10.1109/TTE.2025.3528255
Junyu Chen;Yue Zhao;Minghao Wang;Kai Yang;Yinbo Ge;Ke Wang;Hongjian Lin;Pengyu Pan;Haitao Hu;Zhengyou He;Zhao Xu
IEEE Transactions on Transportation Electrification, vol. 11, no. 3, pp. 7488-7500. Citations: 0.

Abstract

The traditional model-based energy management strategy (EMS) for regenerative braking energy storage systems (RBESSs) is becoming obsolete in the face of increasingly complex and uncertain operating conditions in railway power systems (RPSs). In this article, a model-free deep reinforcement learning (DRL) method is proposed. First, the multiobjective energy management problem for the RBESS is formulated to concurrently achieve regenerative braking energy (RBE) utilization and power demand shaving of the RPS. This problem is then modeled as a Markov decision process (MDP) to be solved by the DRL-based method. Specifically, the RBESS controller is modeled as an agent that interacts with the environment, modeled as the RPS integrated with the RBESS. To coordinate the agent's learning of optimal strategies for multiple energy management objectives on different timescales, a multistage reward function (MSRF) involving a step reward and a final reward is designed. Based on the above elements, the double deep Q-learning algorithm is applied to train the agent for optimizing the EMS. Finally, the proposed DRL-based EMS is tested on the OPAL-RT experimental platform using field load data. Case studies demonstrate that the proposed method outperforms traditional rule-based and optimization-based methods by over 5% in the energy management objective.
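The abstract's two key ingredients, a multistage reward (a per-step term plus a terminal term granted only at episode end, reflecting the different timescales of the objectives) and double deep Q-learning (the online network selects the greedy next action, the target network evaluates it), can be sketched as follows. This is an illustrative sketch only: the paper's actual state/action spaces, reward terms, and hyperparameters are not given in the abstract, so every name and number below is an assumption.

```python
import numpy as np

def msrf_reward(step_reward, final_reward, is_final_step):
    """Multistage reward sketch: a short-timescale step reward plus a
    long-timescale final reward added only at the end of the episode."""
    return step_reward + (final_reward if is_final_step else 0.0)

def double_dqn_target(r, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: decouple action selection (online net) from
    action evaluation (target net) to reduce overestimation bias."""
    if done:
        return r
    a_star = int(np.argmax(next_q_online))    # greedy action from online net
    return r + gamma * next_q_target[a_star]  # evaluated by target net

# Toy usage: one mid-episode transition with assumed Q-values.
r = msrf_reward(step_reward=0.2, final_reward=1.0, is_final_step=False)
q_online = np.array([0.1, 0.5, 0.3])   # online-net Q(s', a) for 3 actions
q_target = np.array([0.4, 0.2, 0.6])   # target-net Q(s', a)
y = double_dqn_target(r, q_online, q_target, gamma=0.9)  # TD target
```

In a full implementation, `y` would serve as the regression target for the online network's Q(s, a) in the standard replay-buffer training loop.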
Journal
IEEE Transactions on Transportation Electrification
CiteScore: 12.20
Self-citation rate: 15.70%
Articles per year: 449
Journal description: IEEE Transactions on Transportation Electrification is focused on components, sub-systems, systems, standards, and grid interface technologies related to power and energy conversion, propulsion, and actuation for all types of electrified vehicles including on-road, off-road, off-highway, and rail vehicles, airplanes, and ships.
Latest articles from this journal:
Modeling, Evaluation, and Comparison of Battery Equalizers Based on the Conservation of Energy
Asymptotic Homogenization Method to Estimate Equivalent Thermal Conductivity of Electrical Machine Windings
Nonlinear Bounded Error Compensation for Flux based Encoderless Controller of PMSMs
Design of a Wound Rotor Brushless Doubly-Fed Machine with Even Pole-Pair Ratio Using Superposition Method
High-Speed Machine Design for Electric Turbochargers Based on General Air-gap Field Modulation Theory