A deep reinforcement learning-based charging scheduling approach with augmented Lagrangian for electric vehicles

IF 10.1 · CAS Tier 1 (Engineering & Technology) · Q1 ENERGY & FUELS · Applied Energy · Pub Date: 2024-11-02 · DOI: 10.1016/j.apenergy.2024.124706
Lun Yang, Guibin Chen, Xiaoyu Cao
{"title":"基于增强拉格朗日的深度强化学习电动汽车充电调度方法","authors":"Lun Yang ,&nbsp;Guibin Chen ,&nbsp;Xiaoyu Cao","doi":"10.1016/j.apenergy.2024.124706","DOIUrl":null,"url":null,"abstract":"<div><div>The adoption of electric vehicles (EVs) is increasingly recognized as a promising solution to decarbonization, thereby large scales of EVs are integrated into transportation and power systems in recent years. The transportation and power systems' operation states largely influence EVs' patterns, introducing uncertainties into EVs' driving patterns and energy demand. Such uncertainties make it a challenge to optimize the operations of charging stations, which provide both charging and electric grid services such as demand responses. To handle this dilemma, this paper models the chargers' operation decisions as a constrained Markov decision process (CMDP). By synergistically combining the augmented Lagrangian method and soft actor-critic algorithm, a novel safe off-policy reinforcement learning (RL) approach is proposed in this paper to solve the CMDP. The actor-network is updated in a policy gradient manner with the Lagrangian value function. A double-critics network is adopted to estimate the action-value function to avoid overestimation bias synchronously. The proposed algorithm does not require a strong convexity guarantee of examined problems and is sample efficient. Comprehensive numerical experiments with real-world electricity prices demonstrate that our proposed algorithm can achieve high solution optimality and constraint compliance.</div></div>","PeriodicalId":246,"journal":{"name":"Applied Energy","volume":"378 ","pages":"Article 124706"},"PeriodicalIF":10.1000,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A deep reinforcement learning-based charging scheduling approach with augmented Lagrangian for electric vehicles\",\"authors\":\"Lun Yang ,&nbsp;Guibin Chen ,&nbsp;Xiaoyu Cao\",\"doi\":\"10.1016/j.apenergy.2024.124706\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The adoption of electric vehicles (EVs) is increasingly recognized as a promising solution to decarbonization, thereby large scales of EVs are integrated into transportation and power systems in recent years. The transportation and power systems' operation states largely influence EVs' patterns, introducing uncertainties into EVs' driving patterns and energy demand. Such uncertainties make it a challenge to optimize the operations of charging stations, which provide both charging and electric grid services such as demand responses. To handle this dilemma, this paper models the chargers' operation decisions as a constrained Markov decision process (CMDP). By synergistically combining the augmented Lagrangian method and soft actor-critic algorithm, a novel safe off-policy reinforcement learning (RL) approach is proposed in this paper to solve the CMDP. The actor-network is updated in a policy gradient manner with the Lagrangian value function. A double-critics network is adopted to estimate the action-value function to avoid overestimation bias synchronously. The proposed algorithm does not require a strong convexity guarantee of examined problems and is sample efficient. 
Comprehensive numerical experiments with real-world electricity prices demonstrate that our proposed algorithm can achieve high solution optimality and constraint compliance.</div></div>\",\"PeriodicalId\":246,\"journal\":{\"name\":\"Applied Energy\",\"volume\":\"378 \",\"pages\":\"Article 124706\"},\"PeriodicalIF\":10.1000,\"publicationDate\":\"2024-11-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Energy\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0306261924020890\",\"RegionNum\":1,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENERGY & FUELS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Energy","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0306261924020890","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENERGY & FUELS","Score":null,"Total":0}
Citations: 0

Abstract

The adoption of electric vehicles (EVs) is increasingly recognized as a promising path to decarbonization, and large numbers of EVs have been integrated into transportation and power systems in recent years. The operating states of these systems strongly influence EV behavior, introducing uncertainty into EVs' driving patterns and energy demand. This uncertainty makes it challenging to optimize the operation of charging stations, which provide both charging and grid services such as demand response. To address this challenge, this paper models the chargers' operating decisions as a constrained Markov decision process (CMDP). By combining the augmented Lagrangian method with the soft actor-critic algorithm, a novel safe off-policy reinforcement learning (RL) approach is proposed to solve the CMDP. The actor network is updated via policy gradients on the Lagrangian value function, and a double-critic network estimates the action-value function to mitigate overestimation bias. The proposed algorithm does not require strong convexity of the examined problem and is sample-efficient. Comprehensive numerical experiments with real-world electricity prices demonstrate that the algorithm achieves high solution optimality and constraint compliance.
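The abstract names the ingredients (a CMDP, an augmented Lagrangian, soft actor-critic) without stating the objective. For orientation, a standard augmented-Lagrangian treatment of a CMDP with a single expected-cost constraint is sketched below; the return J_R(π), cost J_C(π), threshold d, multiplier λ, and penalty coefficient ρ are generic textbook notation, not necessarily the paper's:

```latex
% Constrained problem: maximize expected return subject to an expected-cost budget
\max_{\pi} \; J_R(\pi) \quad \text{s.t.} \quad J_C(\pi) \le d

% Augmented Lagrangian for the inequality constraint (Rockafellar's form),
% minimized over the policy pi:
\mathcal{L}_{\rho}(\pi, \lambda) = -J_R(\pi)
  + \frac{1}{2\rho}\Big[\max\!\big(0,\; \lambda + \rho\,(J_C(\pi) - d)\big)^{2} - \lambda^{2}\Big]

% Dual update of the multiplier after each primal step (or batch of steps):
\lambda \leftarrow \max\!\big(0,\; \lambda + \rho\,(J_C(\pi) - d)\big)
```

Unlike a plain Lagrangian, the quadratic penalty term gives the dual problem curvature; this is the standard motivation behind the abstract's remark that no strong-convexity guarantee of the examined problem is needed.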
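On the actor side, the following is a minimal, self-contained PyTorch sketch of the kind of update this combination implies: twin reward critics whose pointwise minimum curbs overestimation, a separate cost critic, and a SAC-style actor loss with the augmented-Lagrangian penalty added. Every dimension, name, and hyperparameter here is an illustrative assumption; the paper's actual architecture and update rules are not given in the abstract.

```python
# Illustrative sketch only: a SAC-style actor update with twin reward critics
# and an augmented-Lagrangian cost penalty. All sizes, names, and
# hyperparameters are assumptions, not the paper's reported configuration.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 2            # assumed charger state/action sizes

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                         nn.Linear(256, out_dim))

q_r1 = mlp(STATE_DIM + ACTION_DIM, 1)   # twin reward critics: taking the
q_r2 = mlp(STATE_DIM + ACTION_DIM, 1)   # minimum of the two curbs overestimation
q_c  = mlp(STATE_DIM + ACTION_DIM, 1)   # separate critic for the constraint cost
actor = mlp(STATE_DIM, 2 * ACTION_DIM)  # Gaussian policy head: mean and log-std

alpha, rho, cost_limit = 0.2, 10.0, 0.0  # entropy temp, penalty coef, threshold d
lam = torch.tensor(1.0)                  # Lagrange multiplier lambda
opt = torch.optim.Adam(actor.parameters(), lr=3e-4)

def actor_loss(state):
    mean, log_std = actor(state).chunk(2, dim=-1)
    dist = torch.distributions.Normal(mean, log_std.clamp(-5, 2).exp())
    u = dist.rsample()                   # reparameterized sample
    action = torch.tanh(u)               # squash to [-1, 1]
    # log-probability with the standard tanh-squashing correction
    log_prob = (dist.log_prob(u)
                - torch.log(1 - action.pow(2) + 1e-6)).sum(-1, keepdim=True)
    sa = torch.cat([state, action], dim=-1)
    q_reward = torch.min(q_r1(sa), q_r2(sa))   # pessimistic twin estimate
    violation = q_c(sa) - cost_limit
    # augmented-Lagrangian penalty for the inequality constraint Q_c <= d
    penalty = (torch.clamp(lam + rho * violation, min=0.0) ** 2
               - lam ** 2) / (2 * rho)
    return (alpha * log_prob - q_reward + penalty).mean()

loss = actor_loss(torch.randn(64, STATE_DIM))  # one step on a dummy batch
opt.zero_grad(); loss.backward(); opt.step()
# Dual step, run periodically on the batch-mean constraint violation:
# lam = torch.clamp(lam + rho * mean_violation, min=0.0)
```

Taking the minimum of two independently trained critics is the twin-critic device familiar from TD3 and SAC; the abstract indicates the paper adopts the same idea to avoid overestimation bias, alongside the dual multiplier update shown in the trailing comment.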
Source journal: Applied Energy (Engineering & Technology – Chemical Engineering)
CiteScore: 21.20
Self-citation rate: 10.70%
Articles per year: 1830
Review time: 41 days
About the journal: Applied Energy serves as a platform for sharing innovations, research, development, and demonstrations in energy conversion, conservation, and sustainable energy systems. The journal covers topics such as optimal energy resource use, environmental pollutant mitigation, and energy process analysis. It welcomes original papers, review articles, technical notes, and letters to the editor. Authors are encouraged to submit manuscripts that bridge the gap between research, development, and implementation. The journal addresses a wide spectrum of topics, including fossil and renewable energy technologies, energy economics, and environmental impacts. Applied Energy also explores modeling and forecasting, conservation strategies, and the social and economic implications of energy policies, including climate change mitigation. It is complemented by the open-access journal Advances in Applied Energy.