Authors: Kun Zhang; Shijie Luo; Huai-Ning Wu; Rong Su
DOI: 10.1109/TAES.2025.3539264
Journal: IEEE Transactions on Aerospace and Electronic Systems, vol. 61, no. 3, pp. 7725-7737
Published: 2025-02-06
Impact factor: 5.7; JCR: Q1 (Engineering, Aerospace)
URL: https://ieeexplore.ieee.org/document/10876598/
Data-Driven Tracking Control for Nonaffine Yaw Channel of Helicopter via Off-Policy Reinforcement Learning
This article presents an off-policy tracking control scheme for the continuous-time nonaffine yaw channel of an uncrewed aerial vehicle (UAV) helicopter. First, an affine augmented system (AAS) is constructed within a parallel control structure to convert the original nonaffine tracking error dynamics into affine dynamics. Second, a stability criterion linking the nonaffine system and the AAS is derived, showing that the zero-sum policy obtained from the AAS achieves the $H_\infty$ performance of the nonaffine system. Third, a data-driven off-policy tracking algorithm is designed to approximate the zero-sum solution of the Hamilton–Jacobi–Isaacs equations under unknown dynamics. Moreover, a recursive least squares process with a variable forgetting factor is employed to update the actor-critic neural network weights, and the convergence of the algorithm is proven. The uniform ultimate boundedness of the tracking errors is then guaranteed. Finally, two application examples are simulated to validate the effectiveness of the presented method.
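The abstract names a recursive least squares (RLS) update with a variable forgetting factor but does not give the rule used to vary it. The following is a minimal sketch of the general technique, assuming a linear-in-parameters model `y ≈ phi @ w` (as with a fixed-basis critic or actor network). The function name `rls_vff`, the parameters `lam_min`, `lam_max`, `sigma`, `p0`, and the error-driven exponential rule for the forgetting factor are all illustrative assumptions, not the authors' algorithm.

```python
import numpy as np


def rls_vff(Phi, y, lam_min=0.90, lam_max=0.999, sigma=1.0, p0=1e3):
    """Recursive least squares with an error-driven forgetting factor.

    Phi : (N, n) array of regressor (basis function) vectors, one per sample.
    y   : (N,) array of scalar targets.
    Returns the final weight estimate w of shape (n,).
    """
    n = Phi.shape[1]
    w = np.zeros(n)          # weight estimate
    P = p0 * np.eye(n)       # inverse-correlation (covariance-like) matrix

    for phi, target in zip(Phi, y):
        err = target - phi @ w  # a priori prediction error

        # Assumed rule (hypothetical): a large error pushes lam toward
        # lam_min (forget old data faster); a small error pushes it
        # toward lam_max (average over a longer window).
        lam = lam_min + (lam_max - lam_min) * np.exp(-(err / sigma) ** 2)

        # Standard exponentially weighted RLS update with this lam.
        P_phi = P @ phi
        gain = P_phi / (lam + phi @ P_phi)
        w = w + gain * err
        P = (P - np.outer(gain, P_phi)) / lam

    return w
```

With noise-free data generated from fixed weights, the estimate should approach those weights after a few samples; in this sketch the forgetting factor then settles near `lam_max` because the prediction error stays small.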
About the journal:
IEEE Transactions on Aerospace and Electronic Systems focuses on the organization, design, development, integration, and operation of complex systems for space, air, ocean, or ground environments. These systems include, but are not limited to, navigation, avionics, spacecraft, aerospace power, radar, sonar, telemetry, defense, transportation, automated testing, and command and control.