Intelligent decision-making for a “Three-Variable” frequency-hopping pattern based on OC-CDRL

IF 2 | Zone 4 (Computer Science) | JCR Q3 (Engineering, Electrical & Electronic) | Physical Communication | Pub Date: 2024-07-05 | DOI: 10.1016/j.phycom.2024.102434
Ziyu Meng, Shaogang Dai, Zhijin Zhao, Xueyi Ye, Shilian Zheng, Caiyi Lou, Xiaoniu Yang
Physical Communication, Volume 66, Article 102434
https://www.sciencedirect.com/science/article/pii/S1874490724001526
Citations: 0

Abstract

Existing frequency-hopping communication systems do not design their hopping patterns according to the electromagnetic interference environment, so their anti-jamming is blind. To address this problem, a “three-variable” frequency-hopping pattern is proposed, in which the frequency, hopping rate, and instantaneous bandwidth of the frequency-hopping signal vary randomly based on the background electromagnetic interference. The decision-making problem for this pattern is modeled as a Markov decision process (MDP) by constructing the state-action-reward tuple. To alleviate dimension explosion in decision-making, the frequency is designed to vary continuously within a small frequency band selected by a pseudo-random sequence, while the hopping rate and instantaneous bandwidth take discrete values. To solve this MDP efficiently, a combined deep reinforcement learning algorithm based on optimistic exploration and conservative estimation (OC-CDRL) is proposed. It combines features of the TD3 and D3QN algorithms and designs the corresponding states, actions, and rewards to handle the continuous and discrete action spaces, respectively. Because the D3QN algorithm tends to fall into local optima, an optimistic exploration strategy (OES) for action selection is proposed to increase the degree of exploration. In addition, the loss function is improved by conservatively estimating state–action pairs outside the experience replay buffer, which reduces the overestimation of the optimistic action-value function and improves the stability and convergence of the algorithm. Comparative simulations in different electromagnetic interference environments show that OC-CDRL effectively avoids most regions with higher interference and has better adaptability and anti-jamming capability than the compared algorithms.
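The three ideas in the abstract (a hybrid continuous/discrete action, optimistic action selection, and a conservative loss term) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the paper's formulation: the value tables, the band layout, and the function names are hypothetical; the count-based bonus stands in for the OES, whose exact form the abstract does not give; and the log-sum-exp penalty is a generic conservative regularizer for discrete Q-values rather than the authors' loss.

```python
import math
import random

# Hypothetical settings; the abstract does not list the actual values.
HOP_RATES_HZ = [500.0, 1000.0, 2000.0]
BANDWIDTHS_HZ = [25e3, 50e3, 100e3]
BAND_START_HZ = 30e6
SUB_BAND_WIDTH_HZ = 1e6
NUM_SUB_BANDS = 16


def decode_action(sub_band_index, offset, discrete_index):
    """Turn one hybrid action into (frequency, hopping rate, bandwidth).

    sub_band_index: sub-band chosen by the pseudo-random sequence.
    offset: continuous value in [-1, 1] (the TD3-style output in this sketch).
    discrete_index: joint index over rate x bandwidth (the D3QN-style output).
    """
    if not -1.0 <= offset <= 1.0:
        raise ValueError("offset must lie in [-1, 1]")
    rate_idx, bw_idx = divmod(discrete_index, len(BANDWIDTHS_HZ))
    centre = BAND_START_HZ + (sub_band_index + 0.5) * SUB_BAND_WIDTH_HZ
    frequency = centre + offset * SUB_BAND_WIDTH_HZ / 2
    return frequency, HOP_RATES_HZ[rate_idx], BANDWIDTHS_HZ[bw_idx]


def optimistic_action(q_values, visit_counts, bonus_scale=1.0):
    """Pick the action maximizing Q plus a count-based exploration bonus."""
    total = sum(visit_counts) + 1
    scores = [
        q + bonus_scale * math.sqrt(math.log(total) / (n + 1))
        for q, n in zip(q_values, visit_counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def conservative_loss(q_values, taken_action, td_target, alpha=1.0):
    """Squared TD error plus a penalty that pushes down Q-values of
    actions other than the one stored in the replay buffer."""
    q_max = max(q_values)
    log_sum_exp = q_max + math.log(sum(math.exp(q - q_max) for q in q_values))
    td_error = q_values[taken_action] - td_target
    return td_error ** 2 + alpha * (log_sum_exp - q_values[taken_action])


if __name__ == "__main__":
    pn = random.Random(0)  # stand-in for the pseudo-random sequence
    band = pn.randrange(NUM_SUB_BANDS)
    choice = optimistic_action([0.2, 0.1, 0.4], [5, 0, 9])
    print(decode_action(band, offset=0.25, discrete_index=choice))
    print(conservative_loss([0.2, 0.1, 0.4], taken_action=choice, td_target=0.3))
```

Restricting the continuous output to an offset inside one pseudo-randomly chosen sub-band keeps the continuous action one-dimensional and bounded, which is the sense in which the abstract says the design alleviates dimension explosion. The penalty term is never negative, because the log-sum-exp is at least as large as any single Q-value.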

Source journal
Physical Communication (Engineering, Electrical & Electronic; Telecommunications)
CiteScore: 5.00
Self-citation rate: 9.10%
Articles published: 212
Review time: 55 days
Journal description: PHYCOM: Physical Communication is an international and archival journal providing complete coverage of all topics of interest to those involved in all aspects of physical layer communications. Theoretical research contributions presenting new techniques, concepts or analyses, applied contributions reporting on experiences and experiments, and tutorials are published. Topics of interest include but are not limited to: Physical layer issues of Wireless Local Area Networks, WiMAX, Wireless Mesh Networks, Sensor and Ad Hoc Networks, PCS Systems; Radio access protocols and algorithms for the physical layer; Spread Spectrum Communications; Channel Modeling; Detection and Estimation; Modulation and Coding; Multiplexing and Carrier Techniques; Broadband Wireless Communications; Wireless Personal Communications; Multi-user Detection; Signal Separation and Interference Rejection; Multimedia Communications over Wireless; DSP Applications to Wireless Systems; Experimental and Prototype Results; Multiple Access Techniques; Space-time Processing; Synchronization Techniques; Error Control Techniques; Cryptography; Software Radios; Tracking; Resource Allocation and Inference Management; Multi-rate and Multi-carrier Communications; Cross layer Design and Optimization; Propagation and Channel Characterization; OFDM Systems; MIMO Systems; Ultra-Wideband Communications; Cognitive Radio System Architectures; Platforms and Hardware Implementations for the Support of Cognitive Radio Systems; Cognitive Radio Resource Management and Dynamic Spectrum Sharing.