Actor-Critic TD3-based Deep Reinforcement Learning for Energy Management Strategy of HEV

2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA) Pub Date : 2023-06-08 DOI:10.1109/HORA58378.2023.10156727

Ozan Yazar, S. Coskun, Lin Li, Feng Zhang, Cong Huang


Citations: 0

Abstract

Over the last decade, deep reinforcement learning (DRL) algorithms have been employed in the design of energy management strategies (EMSs) for hybrid electric vehicles (HEVs). Assessing the real-time applicability of a DRL algorithm as an EMS requires examining its training time, fuel savings, and state-of-charge (SOC) sustainability. To this end, we propose a twin delayed deep deterministic policy gradient (TD3) algorithm, an improved version of the deep deterministic policy gradient (DDPG) algorithm, for HEV fuel savings. Compared with existing Q-learning-based reinforcement learning and with deep Q-network-based and DDPG-based deep reinforcement learning algorithms, the proposed TD3 provides stable training efficiency, promising fuel economy, and a narrower range of SOC variation (better charge sustainability) under various drive cycles.
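The abstract gives no implementation details, so the following is only a minimal sketch of the three mechanisms by which TD3 is generally described as improving on DDPG: clipped double Q-learning, target policy smoothing, and delayed policy updates. The function names, the scalar action (which could stand for a normalized power-split command), and all hyperparameter values are illustrative assumptions, not the authors' settings.

```python
import random


def td3_target(reward, done, next_action, q1_next, q2_next,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=-1.0, act_high=1.0, rng=random):
    """Critic target for one transition, in the style of TD3.

    next_action      -- target actor's output at the next state (scalar, assumed)
    q1_next, q2_next -- callables mapping an action to the two target critics'
                        value estimates at the next state (hypothetical stand-ins)
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    smoothed = max(act_low, min(act_high, next_action + noise))
    # Clipped double Q-learning: use the smaller of the two critic estimates
    # to counter the overestimation bias of a single critic.
    q_min = min(q1_next(smoothed), q2_next(smoothed))
    # Bootstrapped target; terminal transitions do not bootstrap.
    return reward + gamma * (0.0 if done else 1.0) * q_min


def should_update_actor(step, policy_delay=2):
    """Delayed policy updates: refresh the actor (and target networks)
    only once every `policy_delay` critic updates."""
    return step % policy_delay == 0
```

In a full training loop, both critics would regress toward the value returned by `td3_target`, while `should_update_actor` would gate the less frequent actor and target-network updates. DDPG, by contrast, uses a single critic, no target-action noise, and updates the actor at every step.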
