A Reinforcement-Learning-Enhanced Spoofing Algorithm for UAV With GPS/INS-Integrated Navigation

IF 5.7 | Zone 2 (Computer Science) | JCR Q1 (Engineering, Aerospace) | IEEE Transactions on Aerospace and Electronic Systems | Pub Date: 2025-02-25 | DOI: 10.1109/TAES.2025.3545388
Xiaomeng Ma;Taohan Sun;Meiguo Gao
Volume 61, Issue 4, pp. 8659-8673 | Journal Article | Not open access | Source: https://ieeexplore.ieee.org/document/10904026/
Citation count: 0

Abstract

This article optimizes covert deception of UAV GPS/INS integrated navigation systems by combining spatial information entropy (SIE) with maximum entropy reinforcement learning (MERL). Specifically, SIE is used to describe spatial correlations and thereby refine the entropy components within MERL, with the aim of attaining an elevated distribution of navigational spoofing positions. Because UAV flight control commands are determined exclusively by the current positioning results, regardless of whether the signals are authentic or counterfeit, the navigation deception process satisfies the Markov property. The article then establishes theoretical evidence for the Gaussian distribution properties of spoofing positions based on radar Kalman filter (KF) estimation, and enforces stealth and stability constraints through chi-square distributed random variables. Building on these constraints, a reward function is formulated to jointly optimize deception position concealment, trajectory stability, and successful navigation of the victim UAV to the actual destination. To achieve these objectives, SIE is introduced to model the positional correlations among the deception location, actual destination, and deception destination. Finally, we propose an algorithm based on soft actor-critic (SAC) and SIE, named SIE-SAC, to coordinate the learning process between the deception strategy and the SIE. Without prior knowledge of the UAV's reference trajectory or internal KF parameters, comparative results show that SIE improves deception position concealment. Ablation experiments further validate the constraints' role in stabilizing deceptive trajectories, and the SIE-SAC covert spoofing effect extends to a three-dimensional scenario.
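
The abstract does not spell out the form of its chi-square constraints, so the following is only a generic illustration of why Gaussian position errors lead naturally to a chi-square test. If a 2-D innovation v (measured position minus KF-predicted position) is zero-mean Gaussian with covariance S, the normalized innovation squared v^T S^-1 v follows a chi-square distribution with 2 degrees of freedom, whose tail probability has the closed form exp(-t/2). A minimal sketch under those assumptions follows; the function names, the 2-D restriction, and the 1% false-alarm level are hypothetical choices made here, not taken from the paper.

```python
import math


def chi2_threshold_2dof(false_alarm_prob):
    """Threshold t with P(X > t) = false_alarm_prob for X ~ chi-square(2).

    For 2 degrees of freedom the survival function is exp(-t / 2),
    so the threshold is available in closed form.
    """
    if not 0.0 < false_alarm_prob < 1.0:
        raise ValueError("false_alarm_prob must lie in (0, 1)")
    return -2.0 * math.log(false_alarm_prob)


def nis_2d(innovation, cov):
    """Normalized innovation squared v^T S^-1 v for a 2-D innovation.

    innovation: (vx, vy); cov: ((a, b), (c, d)), assumed positive definite.
    """
    vx, vy = innovation
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    # S^-1 = (1 / det) * [[d, -b], [-c, a]]
    return (d * vx * vx - (b + c) * vx * vy + a * vy * vy) / det


def passes_gate(innovation, cov, false_alarm_prob=0.01):
    """True if the innovation is statistically consistent with covariance cov."""
    return nis_2d(innovation, cov) <= chi2_threshold_2dof(false_alarm_prob)


if __name__ == "__main__":
    unit_cov = ((1.0, 0.0), (0.0, 1.0))
    print(passes_gate((1.0, 1.0), unit_cov))  # small offset
    print(passes_gate((3.0, 4.0), unit_cov))  # large offset
```

In a gate of this kind, a spoofed position whose offset keeps the statistic under the threshold would be statistically indistinguishable from ordinary measurement noise at the chosen false-alarm level, which is one common way to formalize "stealth" against a KF-based consistency check.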
Source journal
CiteScore: 7.80
Self-citation rate: 13.60%
Articles published: 433
Review time: 8.7 months
Journal description: IEEE Transactions on Aerospace and Electronic Systems focuses on the organization, design, development, integration, and operation of complex systems for space, air, ocean, or ground environments. These systems include, but are not limited to, navigation, avionics, spacecraft, aerospace power, radar, sonar, telemetry, defense, transportation, automated testing, and command and control.
Latest articles from this journal
Multidimensional Assessment of the VMF3-FC and Its Application in PPP-IAR
EdgeEnhance-YOLO: A Lightweight Small Object Detection Model with Multi-Dimensional Edge Enhancement
Neural Network Aided Information Filtering for Model Uncertainty
Robust Direct Position Estimation Based on Grid Space Reduction and Data Association in Complex Environments
Adaptive Super-Twisting Kernel Dynamic Programming: Energy Optimal and Robust Theory Application for Pursuit-Evasion Game System