Adaptive energy management strategy for FCHEV based on improved proximal policy optimization in deep reinforcement learning algorithm

IF 9.9 | Zone 1, Engineering & Technology | Q1 ENERGY & FUELS | Energy Conversion and Management | Pub Date: 2024-09-06 | DOI: 10.1016/j.enconman.2024.118977


Citations: 0

Abstract

To reduce hydrogen consumption, slow fuel cell degradation, and improve the operating efficiency of fuel cell hybrid electric vehicles (FCHEVs), an energy management strategy based on an improved proximal policy optimization (PPO) deep reinforcement learning algorithm is proposed. A hierarchical optimal control strategy that adapts to the driving pattern is constructed using a clipping strategy in place of a divergence penalty, which reduces the number of iterations the algorithm needs. In the identification layer, a multilayer perceptron neural network is trained offline to recognize driving patterns, improving identification accuracy, and online recognition is performed with a sliding recognition window. In the policy layer, to improve system response speed and reduce the impact of peak power on the fuel cell, dynamic programming is used offline to optimize the lithium battery state-of-charge (SoC) value for each driving pattern. Then, using an optimization objective of equivalent hydrogen consumption, which combines actual hydrogen consumption and fuel cell degradation in normalized form, the lithium battery is charged and discharged online according to the optimal SoC value for the current driving pattern. Experimental results show that driving pattern recognition accuracy can reach 90.75% and that economic performance improves by 3.11%. The strategy can therefore balance hydrogen consumption against fuel cell degradation, saving hydrogen and slowing the loss of fuel cell lifespan.
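The abstract contrasts a clipping strategy with a divergence penalty but gives no formula. As a rough illustration, the sketch below implements the standard clipped surrogate objective from the general PPO literature, not the improved variant proposed in the paper, whose details are not given here. The function name `ppo_clip_objective`, its parameters, and the default `clip_eps=0.2` are assumptions made for this example.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized).

    Illustrative only: standard PPO clipping, not the paper's improved
    algorithm. All names and the default clip_eps are assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

Because the ratio is clipped, a single update gains nothing from moving the policy far from the one that collected the data. This plays the role of a divergence penalty without an adaptive penalty coefficient to tune.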




Source journal

Energy Conversion and Management (Engineering & Technology: Mechanics)

CiteScore: 19.00

Self-citation rate: 11.50%

Articles published: 1304

Review time: 17 days

About the journal: The journal Energy Conversion and Management provides a forum for publishing original contributions and comprehensive technical review articles of interdisciplinary and original research on all important energy topics. The topics considered include energy generation, utilization, conversion, storage, transmission, conservation, management and sustainability. These topics typically involve various types of energy such as mechanical, thermal, nuclear, chemical, electromagnetic, magnetic and electric. These energy types cover all known energy resources, including renewable resources (e.g., solar, bio, hydro, wind, geothermal and ocean energy), fossil fuels and nuclear resources.