Title: A Reinforcement-Learning-Enhanced Spoofing Algorithm for UAV With GPS/INS-Integrated Navigation
Authors: Xiaomeng Ma; Taohan Sun; Meiguo Gao
Journal: IEEE Transactions on Aerospace and Electronic Systems, 61(4), pp. 8659-8673
Publication date: 2025-02-25
DOI: 10.1109/TAES.2025.3545388
URL: https://ieeexplore.ieee.org/document/10904026/
Publication type: Journal Article
Journal impact factor: 5.7 (JCR Q1, Engineering, Aerospace)
Citations: 0
Abstract
This article optimizes covert deception of UAV GPS/INS integrated navigation systems by combining spatial information entropy (SIE) with maximum entropy reinforcement learning (MERL). Specifically, SIE is used to describe spatial correlations and thereby refine the entropy term in MERL, with the aim of achieving a better distribution of navigational spoofing positions. Because UAV flight control commands are determined exclusively by the current positioning results, regardless of whether the signals are authentic or counterfeit, the navigation deception process satisfies the Markov property. The article then establishes theoretical evidence that the spoofing positions follow a Gaussian distribution, based on radar Kalman filter (KF) estimation, and enforces stealth and stability constraints through chi-square distributed random variables. Building on these constraints, a reward function is formulated to jointly optimize deception position concealment, trajectory stability, and successful navigation of the victim UAV to the actual destination. To achieve these objectives, SIE is introduced to model the positional correlations among the deception location, the actual destination, and the deception destination. Finally, we propose an algorithm based on soft actor-critic (SAC) and SIE, named SIE-SAC, to coordinate the learning process between the deception strategy and the SIE. Comparative results show that, without prior knowledge of the UAV's reference trajectory or internal KF parameters, SIE improves deception position concealment. Ablation experiments further validate the constraints' role in stabilizing deceptive trajectories, and the SIE-SAC covert spoofing effect extends to a three-dimensional scenario.
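The abstract does not give the exact form of the chi-square stealth constraint. As a rough illustration only, the sketch below assumes it resembles a standard normalized-innovation-squared (NIS) gate: a spoofed 2-D position is treated as stealthy when its residual against a predicted position, weighted by an innovation covariance, stays below a chi-square threshold. The function names, the 2-D setting, and the 5% false-alarm rate are all hypothetical choices, not taken from the article.

```python
import math

def chi2_threshold_2dof(false_alarm_rate: float) -> float:
    """Upper-tail chi-square quantile for 2 degrees of freedom.

    For 2 DOF the CDF is 1 - exp(-x / 2), so the threshold exceeded with
    probability `false_alarm_rate` is -2 * ln(false_alarm_rate).
    """
    if not 0.0 < false_alarm_rate < 1.0:
        raise ValueError("false_alarm_rate must lie in (0, 1)")
    return -2.0 * math.log(false_alarm_rate)

def nis_2d(residual, cov) -> float:
    """Normalized innovation squared r^T S^{-1} r for a 2x2 covariance S."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    rx, ry = residual
    # S^{-1} = (1 / det) * [[d, -b], [-c, a]]
    return (rx * (d * rx - b * ry) + ry * (-c * rx + a * ry)) / det

def is_stealthy(spoofed_pos, predicted_pos, cov, false_alarm_rate=0.05) -> bool:
    """Assumed stealth check: the spoofed fix passes the chi-square gate."""
    residual = (spoofed_pos[0] - predicted_pos[0],
                spoofed_pos[1] - predicted_pos[1])
    return nis_2d(residual, cov) <= chi2_threshold_2dof(false_alarm_rate)

if __name__ == "__main__":
    # Hypothetical innovation covariance: 2 m standard deviation per axis.
    cov = ((4.0, 0.0), (0.0, 4.0))
    print(is_stealthy((102.0, 52.0), (100.0, 50.0), cov))  # small offset
    print(is_stealthy((106.0, 50.0), (100.0, 50.0), cov))  # large offset
```

Under this assumed gate, a reward term could penalize spoofing positions whose NIS exceeds the threshold, which is one plausible way to read the "stealth" constraint; the article's actual formulation may differ.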
About the journal:
IEEE Transactions on Aerospace and Electronic Systems focuses on the organization, design, development, integration, and operation of complex systems for space, air, ocean, or ground environment. These systems include, but are not limited to, navigation, avionics, spacecraft, aerospace power, radar, sonar, telemetry, defense, transportation, automated testing, and command and control.