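Adaptive optimized backstepping tracking control for full-state constrained nonlinear strict-feedback systems without using barrier Lyapunov function method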

Optimal Control Applications and Methods | Pub Date: 2024-04-30 | DOI: 10.1002/oca.3136

Boyan Zhu, Ning Xu, Guangdeng Zong, Xudong Zhao


Citations: 0

Abstract

In this article, the problem of adaptive optimal tracking control is studied for nonlinear strict-feedback systems. While not directly measurable, the states of these systems are subject to both time-varying and asymmetric constraints. Bypassing the conventional barrier Lyapunov function method, the constrained system is transformed into its unconstrained counterpart, thereby obviating the need for feasibility conditions. A specially designed reinforcement learning (RL) algorithm, featuring an observer-critic-actor architecture, is deployed in an adaptive optimal control scheme to ensure the stabilization of the converted unconstrained system. Within this architecture, the observer estimates the unmeasurable system states, the critic evaluates the control performance, and the actor executes the control actions. Furthermore, enhancements to the RL algorithm lead to relaxed conditions of persistent excitation, and the design methodology for the observer overcomes the restrictions imposed by the Hurwitz equation. The Lyapunov stability theorem is applied for two primary purposes: to ascertain the boundedness of all signals within the closed-loop system, and to ensure the accuracy of the output signal in tracking the desired reference trajectory. Finally, numerical and practical simulations are provided to corroborate the effectiveness of the proposed control strategy.
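The abstract does not state which transformation is used to turn the constrained system into an unconstrained one. As a rough illustration of the general idea only, the sketch below uses a log-ratio map, a common choice in the constrained-control literature and not necessarily the one in the paper. The function names, the bound functions `k_low(t)` and `k_high(t)`, and their numeric values are all hypothetical. A state x confined to the asymmetric, time-varying interval (-k_low(t), k_high(t)) is mapped to a variable zeta that can take any real value, so a controller that keeps zeta bounded keeps x inside its constraint.

```python
import math


def to_unconstrained(x, k_low, k_high):
    """Map x in (-k_low, k_high) to an unconstrained real zeta.

    Hypothetical log-ratio transformation, assumed for illustration.
    """
    if not (-k_low < x < k_high):
        raise ValueError("state lies outside its constraint interval")
    return math.log((k_low + x) / (k_high - x))


def to_constrained(zeta, k_low, k_high):
    """Inverse map: a real zeta is sent back into [-k_low, k_high].

    The result is strictly inside the interval except when tanh
    saturates in floating point for very large |zeta|.
    """
    # Numerically stable logistic function: sigma = 1 / (1 + exp(-zeta))
    sigma = 0.5 * (1.0 + math.tanh(0.5 * zeta))
    return -k_low + (k_low + k_high) * sigma


# Hypothetical asymmetric, time-varying bounds (illustrative values only).
def k_low(t):
    return 1.0 + 0.2 * math.sin(t)


def k_high(t):
    return 2.0 + 0.5 * math.cos(t)


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0):
        x = 0.3  # a state value inside the bounds at each sampled time
        zeta = to_unconstrained(x, k_low(t), k_high(t))
        x_back = to_constrained(zeta, k_low(t), k_high(t))
        print(f"t={t:.1f}  zeta={zeta:+.4f}  recovered x={x_back:.4f}")
```

Because the map is a bijection between the open interval and the real line, no separate feasibility condition on a virtual controller is needed in this simplified picture; the observer, critic, and actor described in the abstract would then be designed on the transformed variable.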

