Optimized tracking control using reinforcement learning and backstepping technique for canonical nonlinear unknown dynamic system

Yanfen Song, Zijun Li, Guoxing Wen
Journal: Optimal Control Applications and Methods
Publication type: Journal Article
Published: 2024-02-25
DOI: 10.1002/oca.3115 (https://doi.org/10.1002/oca.3115)
Citations: 0

Abstract

This work addresses the optimized tracking control problem for a canonical nonlinear system with unknown dynamics by combining reinforcement learning (RL) with the backstepping technique. Because the state variables of such a system are linked by differential relations, backstepping is used to construct a sequence of virtual controls in accordance with Lyapunov functions. In the final backstepping step, the optimized actual control is derived by performing RL under an identifier-critic-actor structure, where RL is used to overcome the difficulty of solving the Hamilton-Jacobi-Bellman (HJB) equation. Unlike traditional RL optimization methods, which derive the RL updating laws from the square of the approximated HJB equation, the proposed method derives the RL training laws from the negative gradient of a simple positive definite function that is equivalent to the HJB equation. As a result, the optimized control markedly reduces algorithmic complexity and also removes the requirement that the system dynamics be known. Finally, theoretical analysis and simulation indicate the feasibility of the proposed control.
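
To make the backstepping part of the abstract concrete, the following is a minimal sketch of a two-step backstepping tracking design for a second-order canonical system x1' = x2, x2' = f(x) + u. It is not the paper's algorithm: the drift term f is assumed to be known and is cancelled directly, whereas the paper works with unknown dynamics through an identifier and obtains the final control with critic-actor RL, which is not reproduced here. The function name, the drift term, the gains k1 and k2, the reference sin(t), and the Euler step are all assumptions chosen for illustration.

```python
import math


def simulate_backstepping(k1=5.0, k2=5.0, dt=1e-3, t_end=10.0):
    """Track y_r(t) = sin(t) with x1' = x2, x2' = f(x) + u (forward Euler).

    Returns the tracking error x1 - y_r at t = 0 and at t = t_end.
    """

    # Hypothetical drift term, assumed KNOWN here (unlike in the paper).
    def f(x1, x2):
        return -math.sin(x1) - 0.5 * x2

    x1, x2 = 1.0, 0.0
    z1_start = x1 - math.sin(0.0)
    steps = int(round(t_end / dt))

    for i in range(steps):
        t = i * dt
        y, dy, ddy = math.sin(t), math.cos(t), -math.sin(t)

        # Step 1: virtual control alpha1 for the x1-subsystem, chosen so
        # that V1 = z1^2 / 2 decreases whenever x2 follows alpha1.
        z1 = x1 - y
        alpha1 = dy - k1 * z1

        # Step 2: actual control u, chosen so that
        # V2 = V1 + z2^2 / 2 satisfies V2' = -k1*z1^2 - k2*z2^2.
        z2 = x2 - alpha1
        dalpha1 = ddy - k1 * (x2 - dy)  # time derivative of alpha1
        u = -k2 * z2 - z1 - f(x1, x2) + dalpha1

        # Forward-Euler update; the right-hand side uses the old state.
        x1, x2 = x1 + dt * x2, x2 + dt * (f(x1, x2) + u)

    z1_end = x1 - math.sin(steps * dt)
    return z1_start, z1_end


if __name__ == "__main__":
    start, end = simulate_backstepping()
    print("initial tracking error:", start)
    print("final tracking error:  ", end)
```

Under these assumed settings the tracking error should shrink from its initial value toward a small residual set by the Euler step size. In the approach the abstract describes, the last step would instead be obtained by RL rather than by cancelling a known f.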
