Investigation of energy management strategy based on deep reinforcement learning algorithm for multi-speed pure electric vehicles

Weiwei Yang, Denghao Luo, Wenming Zhang, Nong Zhang
{"title":"基于深度强化学习算法的多速纯电动汽车能量管理策略研究","authors":"Weiwei Yang, Denghao Luo, Wenming Zhang, Nong Zhang","doi":"10.1177/09544070241275427","DOIUrl":null,"url":null,"abstract":"With increasingly prominent problems such as environmental pollution and the energy crisis, the development of pure electric vehicles has attracted more and more attention. However, the short range is still one of the main reasons affecting consumer purchases. Therefore, an optimized energy management strategy (EMS) based on the Soft Actor-Critic (SAC) and Deep Deterministic Policy Gradient (DDPG) algorithm is proposed to minimize the energy loss for multi-speed pure electric vehicles, respectively, in this paper. Vehicle speed, acceleration, and battery SOC are selected as state variables, and the action space is set to the transmission gear. The reward function takes into account energy consumption and battery life. Simulation results reveal that the proposed EMS-based SAC has a better performance compared to DDPG in the NEDC cycle, manifested explicitly in the following three aspects: (1) the battery SOC decreases from 0.8 to 0.7339 and 0.73385, and the energy consumption consumes 5264.8 and 5296.6 kJ, respectively; (2) The maximumC-rate is 1.565 and 1.566, respectively; (3) the training efficiency of SAC is higher. Therefore, the SAC-based energy management strategy proposed in this paper has a faster convergence speed and gradually approaches the optimal energy-saving effect with a smaller gap. In the WLTC condition, the SAC algorithm reduces 24.1 kJ of energy compared with DDPG, and the C-rate of SAC is below 1. The maximum value is 1.565, which aligns with the reasonable operating range of vehicle batteries. The results show that the SAC algorithm is adaptable under different working conditions.","PeriodicalId":54568,"journal":{"name":"Proceedings of the Institution of Mechanical Engineers Part D-Journal of Automobile Engineering","volume":"12 1","pages":""},"PeriodicalIF":1.5000,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Investigation of energy management strategy based on deep reinforcement learning algorithm for multi-speed pure electric vehicles\",\"authors\":\"Weiwei Yang, Denghao Luo, Wenming Zhang, Nong Zhang\",\"doi\":\"10.1177/09544070241275427\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With increasingly prominent problems such as environmental pollution and the energy crisis, the development of pure electric vehicles has attracted more and more attention. However, the short range is still one of the main reasons affecting consumer purchases. Therefore, an optimized energy management strategy (EMS) based on the Soft Actor-Critic (SAC) and Deep Deterministic Policy Gradient (DDPG) algorithm is proposed to minimize the energy loss for multi-speed pure electric vehicles, respectively, in this paper. Vehicle speed, acceleration, and battery SOC are selected as state variables, and the action space is set to the transmission gear. The reward function takes into account energy consumption and battery life. 
Simulation results reveal that the proposed EMS-based SAC has a better performance compared to DDPG in the NEDC cycle, manifested explicitly in the following three aspects: (1) the battery SOC decreases from 0.8 to 0.7339 and 0.73385, and the energy consumption consumes 5264.8 and 5296.6 kJ, respectively; (2) The maximumC-rate is 1.565 and 1.566, respectively; (3) the training efficiency of SAC is higher. Therefore, the SAC-based energy management strategy proposed in this paper has a faster convergence speed and gradually approaches the optimal energy-saving effect with a smaller gap. In the WLTC condition, the SAC algorithm reduces 24.1 kJ of energy compared with DDPG, and the C-rate of SAC is below 1. The maximum value is 1.565, which aligns with the reasonable operating range of vehicle batteries. The results show that the SAC algorithm is adaptable under different working conditions.\",\"PeriodicalId\":54568,\"journal\":{\"name\":\"Proceedings of the Institution of Mechanical Engineers Part D-Journal of Automobile Engineering\",\"volume\":\"12 1\",\"pages\":\"\"},\"PeriodicalIF\":1.5000,\"publicationDate\":\"2024-08-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Institution of Mechanical Engineers Part D-Journal of Automobile Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.1177/09544070241275427\",\"RegionNum\":4,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"ENGINEERING, MECHANICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Institution of Mechanical Engineers Part D-Journal of Automobile Engineering","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1177/09544070241275427","RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, MECHANICAL","Score":null,"Total":0}

Abstract

With problems such as environmental pollution and the energy crisis becoming increasingly prominent, the development of pure electric vehicles has attracted growing attention. However, limited driving range remains one of the main factors deterring consumer purchases. This paper therefore proposes optimized energy management strategies (EMS) for multi-speed pure electric vehicles based on the Soft Actor-Critic (SAC) and Deep Deterministic Policy Gradient (DDPG) algorithms, respectively, with the aim of minimizing energy loss. Vehicle speed, acceleration, and battery SOC are selected as state variables, and the action space is the transmission gear. The reward function accounts for both energy consumption and battery life. Simulation results reveal that the proposed SAC-based EMS outperforms the DDPG-based one in the NEDC cycle in three respects: (1) the battery SOC decreases from 0.8 to 0.7339 and 0.73385, with energy consumption of 5264.8 and 5296.6 kJ, respectively; (2) the maximum C-rate is 1.565 and 1.566, respectively; and (3) the training efficiency of SAC is higher. The SAC-based EMS therefore converges faster and approaches the optimal energy-saving result with a smaller gap. Under the WLTC cycle, the SAC algorithm saves 24.1 kJ of energy compared with DDPG, and its C-rate stays largely below 1, with a maximum of 1.565, which lies within the reasonable operating range of vehicle batteries. These results show that the SAC algorithm adapts well to different driving conditions.
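
To make the formulation described in the abstract concrete, the sketch below shows one way the state-action-reward setup could be expressed as a reinforcement-learning environment in Python. It is a minimal illustration under stated assumptions, not the authors' implementation: the class name GearShiftEMSEnv, the gear ratios, the vehicle mass, the efficiency curve, and the reward weights w_energy and w_battery are all placeholders, and the simplified energy and C-rate calculations stand in for the paper's actual vehicle, motor, and battery models.

```python
import numpy as np


class GearShiftEMSEnv:
    """Minimal gear-selection EMS environment sketch (illustrative only).

    State  : [vehicle speed (m/s), acceleration (m/s^2), battery SOC]
    Action : discrete transmission gear index
    Reward : penalizes energy consumption and battery stress (C-rate),
             mirroring the paper's energy-consumption and battery-life terms.
    """

    def __init__(self, drive_cycle, gear_ratios=(3.5, 2.1, 1.4, 1.0),
                 battery_capacity_kwh=50.0, w_energy=1.0, w_battery=0.1):
        self.cycle = list(drive_cycle)          # [(speed, accel), ...] per second
        self.gear_ratios = gear_ratios
        self.capacity_kj = battery_capacity_kwh * 3600.0
        self.w_energy = w_energy
        self.w_battery = w_battery
        self.reset()

    def reset(self):
        self.t = 0
        self.soc = 0.8                          # initial SOC used in the paper
        return self._observe()

    def _observe(self):
        speed, accel = self.cycle[min(self.t, len(self.cycle) - 1)]
        return np.array([speed, accel, self.soc], dtype=np.float32)

    def step(self, gear_index):
        speed, accel = self.cycle[self.t]
        ratio = self.gear_ratios[gear_index]
        dt, mass, aux_kw = 1.0, 1500.0, 1.0     # placeholder vehicle parameters

        # Crude traction-power demand plus a gear-dependent efficiency guess;
        # a real model would include road load, motor maps, and regeneration.
        wheel_power_kw = max(mass * accel * speed, 0.0) / 1000.0 + aux_kw
        efficiency = 0.95 - 0.03 * abs(np.log(ratio))
        energy_kj = wheel_power_kw / efficiency * dt

        # Battery update and a simple C-rate proxy for the battery-life term.
        self.soc -= energy_kj / self.capacity_kj
        c_rate = (energy_kj / dt) * 3600.0 / self.capacity_kj

        reward = -(self.w_energy * energy_kj + self.w_battery * c_rate ** 2)

        self.t += 1
        done = self.t >= len(self.cycle) or self.soc <= 0.05
        return self._observe(), reward, done, {}
```

An SAC (or DDPG) agent would then be trained against such an environment. One practical detail the abstract leaves open: standard SAC and DDPG implementations operate on continuous action spaces, so the discrete gear choice is typically handled either with a discrete-action SAC variant or by mapping a continuous action onto a gear index.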
Source journal
CiteScore: 4.40
Self-citation rate: 17.60%
Articles per year: 263
Average review time: 3.5 months

Journal description: The Journal of Automobile Engineering is an established, high-quality, multi-disciplinary journal which publishes the very best peer-reviewed science and engineering in the field.