Adaptive Critic Control With Knowledge Transfer for Uncertain Nonlinear Dynamical Systems: A Reinforcement Learning Approach

Impact factor: 6.4 | Zone 2, Computer Science | JCR Q1, Automation & Control Systems | IEEE Transactions on Automation Science and Engineering | Pub Date: 2024-09-09 | DOI: 10.1109/TASE.2024.3453926
Liangju Zhang;Kun Zhang;Xiang Peng Xie;Mohammed Chadli
IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 6752-6761. Full text: https://ieeexplore.ieee.org/document/10669391/
Citations: 0

Abstract

This paper presents an online transfer heuristic dynamic programming (THDP) control approach for a class of nonlinear discrete systems. The approach integrates transfer learning with adaptive critic control. To design a robust optimal control strategy, sample data collected from a source task are used to acquire prior knowledge, which then guides the online control of the nonlinear systems in the target tasks. To avoid negative transfer and conserve computational resources, a novel attenuation function with a truncation mechanism is introduced. A disturbance compensation control mechanism is also developed to address uncertainties. Furthermore, the properties of the uncertain nonlinear systems under robust optimal control, as well as the weight errors of the neural networks, are shown to be ultimately uniformly bounded under certain conditions. Finally, two simulations verify the performance of the proposed algorithm.

Note to Practitioners: Adaptive dynamic programming (ADP) is one of the main methods for solving the Hamilton-Jacobi-Bellman (HJB) equation. With neural network approximation, however, it often requires many iterations and a large amount of computation, which wastes computational resources. For this reason, we propose an ADP control scheme with enhanced detection speed: prior knowledge is obtained by learning a class of similar tasks and is then used to assist the online control of the actual system. The scheme also accounts for system disturbances, which makes it more general and robust. Simulation experiments show that the scheme performs well.
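
The abstract does not give the form of the attenuation function, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes an exponentially decaying transfer weight that is truncated to zero once it falls below a cutoff, after which the source-task prior is no longer evaluated. The names `attenuation_weight`, `blended_value`, `decay_rate`, and `cutoff` are hypothetical.

```python
import math
from typing import Callable, Sequence

State = Sequence[float]


def attenuation_weight(k: int, decay_rate: float = 0.1, cutoff: float = 1e-2) -> float:
    """Hypothetical transfer weight at time step k.

    Assumed form: exp(-decay_rate * k), truncated to exactly 0.0 once it
    drops below `cutoff`. The paper's actual function may differ.
    """
    w = math.exp(-decay_rate * k)
    return w if w >= cutoff else 0.0


def blended_value(
    k: int,
    x: State,
    prior_critic: Callable[[State], float],
    online_critic: Callable[[State], float],
    decay_rate: float = 0.1,
    cutoff: float = 1e-2,
) -> float:
    """Blend a source-task value estimate with the online estimate.

    While the weight is positive, the prior guides the estimate. After
    truncation the prior is skipped entirely, which saves its evaluation
    cost and prevents a mismatched prior from biasing later learning.
    """
    w = attenuation_weight(k, decay_rate, cutoff)
    if w == 0.0:
        return online_critic(x)
    return w * prior_critic(x) + (1.0 - w) * online_critic(x)


if __name__ == "__main__":
    # Toy critics (assumptions for illustration only).
    prior = lambda x: sum(v * v for v in x)          # quadratic prior
    online = lambda x: 2.0 * sum(v * v for v in x)   # different online estimate
    for step in (0, 10, 100):
        print(step, blended_value(step, [1.0, 0.5], prior, online))
```

In this sketch the prior dominates at the start of online control and its influence fades over time; the truncation step is what stops the prior from being evaluated once its weight has become negligible.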
Source journal
IEEE Transactions on Automation Science and Engineering (Engineering Technology: Automation & Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months
About the journal: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.
Latest articles from the journal
Finite-time Formation for Two-Wheeled Mobile Robots with A Triggered and Saturated Control
Observer-based Fault Estimation and Compensation Control for Nonlinear Multi-delays Systems With Multi-Source Faults
Data-driven H∞ performance analysis for event-triggered model-free systems: a data-based expression method
Meta-Learning Enhanced Online Adaptive Control for Robust Motion of Autonomous Electric Vehicles
Prescribed Iterative Learning Control for Switched Systems