Intelligent Anti-Jamming Decision With Continuous Action and State in Bivariate Frequency Agility Communication System

IEEE Transactions on Cognitive Communications and Networking (IF 7.4; Zone 1, Computer Science; JCR Q1, Telecommunications). Pub Date: 2023-08-18. DOI: 10.1109/TCCN.2023.3306363
Yupei Zhang; Zhijin Zhao; Shilian Zheng; Fangfang Qiang
Vol. 9, No. 6, pp. 1579-1595. Not open access. IEEE Xplore: https://ieeexplore.ieee.org/document/10224324/
Citations: 0

Abstract

The conventional frequency hopping (FH) system is susceptible to malicious jamming because its hopping frequency table is prearranged. In this paper, we develop a bivariate frequency agility (BFA) communication system that improves anti-jamming capability by assigning time-varying characteristics to communication parameters such as the frequency interval and the hopping rate, which are fixed in conventional FH. Our goal is to find the frequency interval and hopping rate strategy that maximizes the signal-to-interference-plus-noise ratio (SINR) in a jamming environment. We formulate the parameter decision problem as a Markov decision process (MDP). We then propose a deep deterministic policy gradient (DDPG) based algorithm for frequency interval selection and hopping rate setting. In addition, to overcome the shortcomings of DDPG, which is prone to falling into local optima and to unstable convergence, we propose an improved DDPG algorithm (IDDPG) with weighted dual-prioritized experience replay and a periodically updated learning rate. In IDDPG, on the one hand, the model is trained by replaying more often the experiences with high immediate reward and large temporal difference error (TD error), which makes it more accurate. On the other hand, the learning rate is periodically decayed so that the update rate of the network model varies periodically, resulting in richer and more diverse exploration. Simulation results under different electromagnetic jamming environments indicate that both proposed algorithms outperform the PPER-DQN algorithm and the RFH algorithm in anti-jamming performance.
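
The abstract names two IDDPG ingredients without giving their formulas: replay that favors experiences with high immediate reward and large TD error, and a learning rate that decays periodically. The following is a minimal sketch of how such mechanisms could look, not the authors' implementation. The function names, the mixing weight `w`, the clipping of negative rewards, the cosine decay shape, and all default values are assumptions made here for illustration.

```python
import math
import random


def dual_priority(reward, td_error, w=0.5, eps=1e-6):
    """Hypothetical priority: weighted mix of immediate reward and |TD error|.

    Assumption: negative rewards are clipped to zero so priorities stay positive.
    """
    return w * max(reward, 0.0) + (1.0 - w) * abs(td_error) + eps


def sample_indices(priorities, batch_size, rng):
    """Draw buffer indices with probability proportional to priority (with replacement)."""
    return rng.choices(range(len(priorities)), weights=priorities, k=batch_size)


def periodic_lr(step, lr_max=1e-3, lr_min=1e-5, period=1000):
    """Assumed schedule: cosine decay from lr_max toward lr_min, restarting every `period` steps."""
    phase = (step % period) / period
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * phase))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy replay buffer of (immediate reward, TD error) pairs.
    buffer = [(0.1, 0.05), (0.9, 1.2), (0.0, -0.01), (0.5, -2.0)]
    priorities = [dual_priority(r, td) for r, td in buffer]
    batch = sample_indices(priorities, batch_size=8, rng=rng)
    print("sampled indices:", batch)
    for step in (0, 500, 999, 1000):
        print(step, periodic_lr(step))
```

In a full agent, the sampled indices would select the minibatch for the actor and critic updates, and `periodic_lr(step)` would be written into the optimizer before each update. The restart at the start of every period is what makes the update rate vary periodically rather than decay once.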
Source journal
IEEE Transactions on Cognitive Communications and Networking (Computer Science: Artificial Intelligence)
CiteScore: 15.50
Self-citation rate: 7.00%
Articles published: 108
About the journal: The IEEE Transactions on Cognitive Communications and Networking (TCCN) aims to publish high-quality manuscripts that push the boundaries of cognitive communications and networking research. Cognitive, in this context, refers to the application of perception, learning, reasoning, memory, and adaptive approaches in communication system design. The transactions welcome submissions that explore various aspects of cognitive communications and networks, focusing on innovative and holistic approaches to complex system design. Key topics covered include architecture, protocols, cross-layer design, and cognition cycle design for cognitive networks. Additionally, research on machine learning, artificial intelligence, end-to-end and distributed intelligence, software-defined networking, cognitive radios, spectrum sharing, and security and privacy issues in cognitive networks is of interest. The publication also encourages papers addressing novel services and applications enabled by these cognitive concepts.
Latest articles from this journal
Intelligent Resource Adaptation for Diversified Service Requirements in Industrial IoT
Real Field Error Correction for Coded Distributed Computing based Training
Adaptive PCI Allocation in Heterogeneous Networks: A DRL-Driven Framework With Hash Table, FAGA, and Guiding Policies
Generative AI on SpectrumNet: An Open Benchmark of Multiband 3D Radio Maps
LiveStream Meta-DAMS: Multipath Scheduler Using Hybrid Meta Reinforcement Learning for Live Video Streaming