{"title":"Adaptive Critic Control With Knowledge Transfer for Uncertain Nonlinear Dynamical Systems: A Reinforcement Learning Approach","authors":"Liangju Zhang;Kun Zhang;Xiang Peng Xie;Mohammed Chadli","doi":"10.1109/TASE.2024.3453926","DOIUrl":null,"url":null,"abstract":"This paper presents an online transfer heuristic dynamic programming (THDP) control approach for a class of nonlinear discrete systems. The proposed approach integrates transfer learning with adaptive critic control. To design a robust optimal control strategy for the nonlinear discrete systems, we utilize sample data collected from a source task to acquire prior knowledge. This prior knowledge is subsequently used to guide the online control process of nonlinear systems of target tasks. To avoid negative transfer effects and conserve computational resources, we introduce a novel attenuation function with a truncation mechanism. Additionally, we develop a disturbance compensation control mechanism to address uncertainties. Furthermore, we demonstrate that the properties of the uncertain nonlinear systems under robust optimal control, as well as the weight error of neural networks, are ultimately uniformly bounded given certain conditions. Finally, two simulations are conducted to verify the performance of the proposed algorithm. Note to Practitioners—Adaptive dynamic programming (ADP) is one of the main methods to solve the Hamilton-Jacobi-Bellman (HJB) equation. However, when using neural network approximation, it often requires a long time of iteration and a large amount of computational process, wasting a lot of computational resources. For this reason, we propose an ADP control scheme with enhanced detection speed: that is, by learning a class of similar tasks to obtain prior knowledge to assist in the online control of our actual system. At the same time, this paper considers system disturbances, which means that they are more universal and robust. 
After simulation experiments, it has been proven that this scheme has good performance.","PeriodicalId":51060,"journal":{"name":"IEEE Transactions on Automation Science and Engineering","volume":"22 ","pages":"6752-6761"},"PeriodicalIF":6.4000,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Automation Science and Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10669391/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 0
Abstract
This paper presents an online transfer heuristic dynamic programming (THDP) control approach for a class of discrete-time nonlinear systems. The proposed approach integrates transfer learning with adaptive critic control. To design a robust optimal control strategy for these systems, we use sample data collected from a source task to acquire prior knowledge, which is then used to guide the online control of the target-task nonlinear systems. To avoid negative transfer effects and conserve computational resources, we introduce a novel attenuation function with a truncation mechanism. Additionally, we develop a disturbance compensation control mechanism to address uncertainties. Furthermore, we show that, under certain conditions, the states of the uncertain nonlinear systems under the robust optimal control, as well as the neural network weight errors, are uniformly ultimately bounded. Finally, two simulations are conducted to verify the performance of the proposed algorithm.

Note to Practitioners—Adaptive dynamic programming (ADP) is one of the main methods for solving the Hamilton-Jacobi-Bellman (HJB) equation. However, when neural network approximation is used, it often requires lengthy iteration and a large amount of computation, wasting considerable computational resources. For this reason, we propose an ADP control scheme with improved learning speed: prior knowledge is obtained by learning from a class of similar tasks and then used to assist the online control of the actual system. This paper also accounts for system disturbances, making the scheme more broadly applicable and robust. Simulation experiments confirm that the scheme performs well.
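The core ideas of the abstract, blending source-task knowledge into an online critic update and attenuating (then truncating) that influence to prevent negative transfer, can be illustrated with a minimal sketch. This is an illustrative assumption, not the paper's exact formulation: the linear critic, the exponential attenuation schedule, and all function and parameter names (`attenuation`, `critic_update`, `lam0`, `rho`, `eps`) are hypothetical.

```python
import numpy as np

def attenuation(k, lam0=1.0, rho=0.9, eps=1e-3):
    """Decaying transfer weight at step k, truncated to exactly 0 once
    it becomes negligible, so stale source knowledge stops affecting
    the target task (avoiding negative transfer) and its contribution
    no longer needs to be computed."""
    lam = lam0 * rho ** k
    return lam if lam > eps else 0.0

def critic_update(w_target, w_source, phi, r, phi_next, k,
                  gamma=0.95, alpha=0.05):
    """One gradient step on the temporal-difference error of a linear
    critic V(x) = w^T phi(x).  The bootstrapped value of the next state
    is blended toward the source-task critic by the attenuation weight,
    so early learning is guided by prior knowledge."""
    lam = attenuation(k)
    v_next = (1.0 - lam) * (w_target @ phi_next) + lam * (w_source @ phi_next)
    td_error = r + gamma * v_next - w_target @ phi
    w_new = w_target + alpha * td_error * phi
    return w_new, lam
```

At k = 0 the update is driven almost entirely by the source critic; as k grows, the attenuation decays and the target critic bootstraps from its own estimates, until the truncation sets the transfer weight to zero and the scheme reduces to standard heuristic dynamic programming.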
Journal Introduction
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.