Ozan Yazar, S. Coskun, Lin Li, Feng Zhang, Cong Huang
2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA). Published 2023-06-08. DOI: 10.1109/HORA58378.2023.10156727 (https://doi.org/10.1109/HORA58378.2023.10156727)
Actor-Critic TD3-based Deep Reinforcement Learning for Energy Management Strategy of HEV
Over the last decade, deep reinforcement learning (DRL) algorithms have been employed to design energy management strategies (EMSs) for hybrid electric vehicles (HEVs). Investigating the real-time applicability of DRL algorithms as an EMS is critical in terms of training time, fuel savings, and state-of-charge (SOC) sustainability. To this end, we propose a twin delayed deep deterministic policy gradient (TD3) algorithm, an improved version of the deep deterministic policy gradient (DDPG) algorithm, for HEV fuel savings. Compared with existing Q-learning-based reinforcement learning and with deep Q-network-based and DDPG-based deep reinforcement learning algorithms, the proposed TD3 provides stable training efficiency, promising fuel economy, and a narrower SOC variation range, indicating better charge sustainability, under various drive cycles.
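The abstract does not spell out how TD3 differs from DDPG, so the following is only a minimal sketch of the three mechanisms generally associated with TD3: a clipped double-Q target, target policy smoothing, and delayed actor updates. All function names, the action bounds, and the hyperparameter values (`gamma`, `sigma`, `noise_clip`, `policy_delay`) are illustrative assumptions based on commonly used TD3 defaults; they are not taken from this paper, and the HEV state, action, and reward definitions are not given here.

```python
import numpy as np


def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of two target-critic values.

    gamma is an assumed discount factor, not a value reported in the abstract.
    """
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    min_q = np.minimum(np.asarray(next_q1, dtype=float),
                       np.asarray(next_q2, dtype=float))
    # Terminal transitions (done == 1) do not bootstrap from the next state.
    return reward + gamma * (1.0 - done) * min_q


def smoothed_target_action(target_action, rng, sigma=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0):
    """Target policy smoothing: perturb the target actor's action with clipped noise.

    The bounds low/high are hypothetical normalized action limits.
    """
    target_action = np.asarray(target_action, dtype=float)
    noise = rng.normal(0.0, sigma, size=target_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    return np.clip(target_action + noise, low, high)


def should_update_actor(step, policy_delay=2):
    """Delayed policy update: refresh the actor only every policy_delay critic steps."""
    return step % policy_delay == 0
```

In a full training loop, the smoothed action would be fed to both target critics to obtain `next_q1` and `next_q2`, and `td3_target` would then supply the regression target for both critics. Taking the minimum of the two estimates is what counteracts the Q-value overestimation that DDPG is prone to.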