{"title":"基于强化学习的异构电池组最优循环","authors":"Vivek Deulkar, J. Nair","doi":"10.1109/SmartGridComm51999.2021.9632307","DOIUrl":null,"url":null,"abstract":"We consider the problem of optimal charging/discharging of a bank of heterogenous battery units, driven by stochastic electricity generation and demand processes. The batteries in the battery bank may differ with respect to their capacities, ramp constraints, losses, as well as cycling costs. The goal is to minimize the degradation costs associated with battery cycling in the long run; this is posed formally as a Markov decision process. We propose a linear function approximation based Q-learning algorithm for learning the optimal solution, using a specially designed class of kernel functions that approximate the structure of the value functions associated with the MDP. The proposed algorithm is validated via an extensive case study.","PeriodicalId":378884,"journal":{"name":"2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Optimal Cycling of a Heterogenous Battery Bank via Reinforcement Learning\",\"authors\":\"Vivek Deulkar, J. Nair\",\"doi\":\"10.1109/SmartGridComm51999.2021.9632307\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We consider the problem of optimal charging/discharging of a bank of heterogenous battery units, driven by stochastic electricity generation and demand processes. The batteries in the battery bank may differ with respect to their capacities, ramp constraints, losses, as well as cycling costs. The goal is to minimize the degradation costs associated with battery cycling in the long run; this is posed formally as a Markov decision process. We propose a linear function approximation based Q-learning algorithm for learning the optimal solution, using a specially designed class of kernel functions that approximate the structure of the value functions associated with the MDP. The proposed algorithm is validated via an extensive case study.\",\"PeriodicalId\":378884,\"journal\":{\"name\":\"2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-09-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SmartGridComm51999.2021.9632307\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SmartGridComm51999.2021.9632307","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

我们考虑了在随机发电和随机需求过程驱动下的一组异质电池单元的最优充放电问题。电池组中的电池可能在容量、斜坡限制、损耗以及循环成本方面有所不同。从长远来看,目标是最大限度地减少与电池循环相关的退化成本;这是一个正式的马尔可夫决策过程。我们提出了一种基于线性函数近似的q -学习算法,用于学习最优解,该算法使用了一类特殊设计的核函数,这些核函数近似于与MDP相关的值函数的结构。通过广泛的案例研究验证了所提出的算法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Optimal Cycling of a Heterogenous Battery Bank via Reinforcement Learning
We consider the problem of optimal charging/discharging of a bank of heterogenous battery units, driven by stochastic electricity generation and demand processes. The batteries in the battery bank may differ with respect to their capacities, ramp constraints, losses, as well as cycling costs. The goal is to minimize the degradation costs associated with battery cycling in the long run; this is posed formally as a Markov decision process. We propose a linear function approximation based Q-learning algorithm for learning the optimal solution, using a specially designed class of kernel functions that approximate the structure of the value functions associated with the MDP. The proposed algorithm is validated via an extensive case study.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Low-complexity Risk-averse MPC for EMS Modeling framework for study of distributed and centralized smart grid system services Data-Driven Frequency Regulation Reserve Prediction Based on Deep Learning Approach Data Communication Interfaces in Smart Grid Real-time Simulations: Challenges and Solutions Modeling of Cyber Attacks Against Converter-Driven Stability of PMSG-Based Wind Farms with Intentional Subsynchronous Resonance
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1