Reinforcement Learning-Based 3D Trajectory Tracking Control of Hypersonic Gliding Vehicles With Time-Varying Uncertainties
Authors: Biao Luo; Jingyi Sun; Rui Tang; Xiaodong Xu
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 8187-8199
Publication date: 2024-10-29
Publication type: Journal Article
DOI: 10.1109/TASE.2024.3481422
URL: https://ieeexplore.ieee.org/document/10737663/
Impact factor: 6.4; JCR: Q1 (Automation & Control Systems)
Citation count: 0
Abstract
In this paper, a robust three-dimensional trajectory tracking control scheme based on reinforcement learning (RL) is proposed for the glide phase of a hypersonic gliding vehicle (HGV) with time-varying uncertainties. First, the non-affine nonlinear full-state kinematics and dynamics model of the HGV glide phase is constructed. Then, without linearizing the system, the desired multiplanar reference trajectories for HGVs are planned based on pseudo-spectral theory under input constraints, initial conditions, and terminal conditions. Subsequently, the full-state error system is obtained by subtracting the reference system state from the actual state of the HGV system with time-varying uncertainty. For this error system, which is subject to time-varying uncertainty and input constraints, we design an RL-based optimal control scheme for its nominal system and establish the equivalence between this optimal control and the robust control of the original HGV error system. A single-evaluation network structure is used in the implementation to reduce the computational cost. A rigorous analysis demonstrates the uniform ultimate boundedness of the closed-loop system and the weight error. Finally, we run tracking simulations on reference trajectories optimized for different performance indexes to verify the effectiveness of the proposed method.

Note to Practitioners: The glide phase of HGVs, which links the initial descent phase and the terminal management phase, is subject to various constraints and uncertainties. Designing robust trajectory tracking controllers for the glide phase in complex environments and over a large span of flight parameters is of great significance to aerial guidance practitioners. In this paper, an RL-based robust three-dimensional trajectory tracking guidance method is proposed for the HGV glide phase system; it can resist time-varying uncertainties and satisfy flight constraints. The uniform ultimate boundedness of the closed-loop system is proved using the Lyapunov method. The proposed tracking algorithm is effective for reference trajectories with different performance indexes.
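The abstract describes the full-state error system as the actual state minus the reference state. The following minimal Python sketch illustrates only that subtraction step. The six-component state ordering, the linear interpolation of the reference trajectory, the helper names, and all numbers are assumptions made for illustration; none of them are taken from the paper.

```python
from bisect import bisect_right

# Hypothetical state ordering (an assumption, not from the paper):
# [radius (m), longitude (rad), latitude (rad), speed (m/s),
#  flight-path angle (rad), heading angle (rad)]

def interpolate_reference(times, states, t):
    """Linearly interpolate the reference state at time t, clamped at both ends."""
    if t <= times[0]:
        return list(states[0])
    if t >= times[-1]:
        return list(states[-1])
    i = bisect_right(times, t) - 1          # index of the segment containing t
    w = (t - times[i]) / (times[i + 1] - times[i])
    return [a + w * (b - a) for a, b in zip(states[i], states[i + 1])]

def tracking_error(actual, reference):
    """Full-state error e = x - x_ref, computed component by component."""
    return [x - r for x, r in zip(actual, reference)]

# Made-up two-point reference trajectory and a made-up measured state.
ref_times = [0.0, 10.0]
ref_states = [
    [6.43e6, 0.00, 0.00, 6000.0, -0.010, 0.50],
    [6.42e6, 0.01, 0.02, 5900.0, -0.020, 0.52],
]
x_actual = [6.4255e6, 0.006, 0.010, 5960.0, -0.014, 0.51]

x_ref = interpolate_reference(ref_times, ref_states, 5.0)
error = tracking_error(x_actual, x_ref)
```

In a tracking loop of this kind, a controller would drive `error` toward zero; the RL-based controller itself is not sketched here because the abstract does not give its equations.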
About the journal:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.