Title: Intelligent Anti-Jamming Decision With Continuous Action and State in Bivariate Frequency Agility Communication System
Authors: Yupei Zhang; Zhijin Zhao; Shilian Zheng; Fangfang Qiang
Journal: IEEE Transactions on Cognitive Communications and Networking, vol. 9, no. 6, pp. 1579-1595
Publication date: 2023-08-18
Publication type: Journal Article
DOI: 10.1109/TCCN.2023.3306363
URL: https://ieeexplore.ieee.org/document/10224324/
Impact factor: 7.4
JCR: Q1 (Telecommunications)
Category: Computer Science
Abstract
Conventional frequency hopping (FH) systems are susceptible to malicious jamming because the hopping frequency table is prearranged. In this paper, we develop a bivariate frequency agility (BFA) communication system that improves anti-jamming capability by making communication parameters that are fixed in conventional FH, such as the frequency interval and the hopping rate, time-varying. Our goal is to find the frequency interval and hopping rate strategy that maximizes the signal-to-interference-plus-noise ratio (SINR) in a jamming environment. We formulate the parameter decision problem as a Markov decision process (MDP) and propose a deep deterministic policy gradient (DDPG) based algorithm for frequency interval selection and hopping rate setting. To overcome the shortcomings of DDPG, which is prone to falling into local optima and to unstable convergence, we further propose an improved DDPG algorithm (IDDPG) with weighted dual-prioritized experience replay and a periodically updated learning rate. In IDDPG, experiences with high immediate reward and large temporal-difference (TD) error are replayed more often during training, which makes the model more accurate. In addition, the learning rate is decayed periodically so that the update rate of the network model varies periodically, which yields richer and more diverse exploration. Simulation results under different electromagnetic jamming environments indicate that both proposed algorithms outperform the PPER-DQN and RFH algorithms in anti-jamming performance.
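The two IDDPG ingredients named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption for illustration and not the authors' method: the function names (`dual_priority`, `sample_indices`, `periodic_lr`), the linear weighting of reward and TD error, the exponential decay with restart, and all default values are hypothetical.

```python
import random


def dual_priority(reward, td_error, weight=0.5, eps=1e-6):
    """Hypothetical replay priority: weighted mix of immediate reward and |TD error|.

    `weight` balances the two terms; `eps` keeps every priority strictly positive
    so that no stored experience becomes impossible to sample.
    """
    return weight * max(reward, 0.0) + (1.0 - weight) * abs(td_error) + eps


def sample_indices(priorities, batch_size, rng):
    """Draw buffer indices with probability proportional to priority (with replacement)."""
    return rng.choices(range(len(priorities)), weights=priorities, k=batch_size)


def periodic_lr(step, base_lr=1e-3, period=1000, decay=0.5, floor=1e-5):
    """Hypothetical schedule: decay smoothly inside each period, then restart at base_lr."""
    phase = step % period
    return max(floor, base_lr * decay ** (phase / period))


if __name__ == "__main__":
    rng = random.Random(0)
    # (immediate reward, TD error) pairs standing in for stored transitions.
    experiences = [(0.1, 0.05), (0.9, 1.2), (0.0, 0.0), (0.4, 0.3)]
    priorities = [dual_priority(r, td) for r, td in experiences]
    batch = sample_indices(priorities, batch_size=8, rng=rng)
    print("sampled indices:", batch)
    for step in (0, 500, 999, 1000):
        print(step, periodic_lr(step))
```

In this sketch a transition with both high reward and large TD error receives the largest sampling weight, and the learning rate returns to its base value at the start of every period, so the size of the network updates varies cyclically during training.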
About the journal:
The IEEE Transactions on Cognitive Communications and Networking (TCCN) aims to publish high-quality manuscripts that push the boundaries of cognitive communications and networking research. Cognitive, in this context, refers to the application of perception, learning, reasoning, memory, and adaptive approaches in communication system design. The transactions welcome submissions that explore various aspects of cognitive communications and networks, focusing on innovative and holistic approaches to complex system design. Key topics include architecture, protocols, cross-layer design, and cognition cycle design for cognitive networks. Research on machine learning, artificial intelligence, end-to-end and distributed intelligence, software-defined networking, cognitive radios, spectrum sharing, and security and privacy issues in cognitive networks is also of interest. The publication also encourages papers addressing novel services and applications enabled by these cognitive concepts.