Intelligent Frequency Reuse for Dynamic Spectrum Anti-Jamming: A Hybrid-Reward-Based Multi-Agent Deep Reinforcement Learning Approach

IF 5.5 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | IEEE Wireless Communications Letters | Pub Date: 2024-12-26 | DOI: 10.1109/LWC.2024.3523221

Zhenyi Ke;Ximing Wang;Zhiyong Du;Tao Xiong;Yifan Xu;Jiaqi Chen

IEEE Wireless Communications Letters, vol. 14, no. 3, pp. 771-775. Journal Article. https://ieeexplore.ieee.org/document/10816518/

Citation count: 0

Abstract

This letter investigates distributed multi-user dynamic spectrum access in dynamic and unknown jamming environments using deep reinforcement learning. Most existing studies consider small-scale networks with sufficient communication channels (number of channels > number of users), in which users can obtain global spectrum states to learn a collaborative anti-jamming policy. They also assume a reliable control link that exchanges control information without interference. As a result, they perform poorly in practical networks with limited spectrum resources and only local information. To address these issues, we present a collaborative anti-jamming approach based on the idea of intelligent frequency reuse. To capture the independent, local nature of each user's learning, we formulate the multi-user decision-making problem as a decentralized partially observable Markov decision process. Then, a hybrid-reward-based deep reinforcement learning algorithm is designed to learn the multi-user frequency reuse task and the anti-jamming task, simultaneously realizing internal frequency coordination and external anti-jamming. Finally, simulation results verify the effectiveness and robustness of the proposed approach.
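The abstract does not give the reward definition, so the following is only a minimal sketch of what a hybrid reward could look like. It assumes the reward is a weighted sum of an external anti-jamming term and an internal frequency-coordination term. The function name `hybrid_reward`, the weights `w_jam` and `w_reuse`, the `neighbors` interference map, and the +1/-1 values are all hypothetical and are not taken from the letter.

```python
from typing import Dict, List, Set


def hybrid_reward(
    user: int,
    channels: List[int],
    jammed: Set[int],
    neighbors: Dict[int, List[int]],
    w_jam: float = 1.0,
    w_reuse: float = 1.0,
) -> float:
    """Illustrative hybrid reward for one user (assumed form, not the letter's)."""
    ch = channels[user]
    # External term: +1 if the chosen channel avoids the jammer, -1 otherwise.
    r_jam = -1.0 if ch in jammed else 1.0
    # Internal term: penalize each interfering neighbor on the same channel.
    # Users that are not neighbors may reuse the channel at no cost.
    collisions = sum(1 for v in neighbors[user] if channels[v] == ch)
    r_reuse = -float(collisions)
    return w_jam * r_jam + w_reuse * r_reuse


# Three users share two channels (fewer channels than users).
# Users 0 and 2 are assumed far enough apart to reuse a channel.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
reuse_plan = [0, 1, 0]      # users 0 and 2 reuse channel 0
clash_plan = [0, 0, 1]      # neighbors 0 and 1 collide on channel 0

rewards_reuse = [hybrid_reward(u, reuse_plan, set(), neighbors) for u in range(3)]
rewards_clash = [hybrid_reward(u, clash_plan, set(), neighbors) for u in range(3)]
reward_jammed = hybrid_reward(0, reuse_plan, {0}, neighbors)
```

Under these assumptions, the reuse plan scores higher than the clashing plan for the two users that collide, which is the behavior a frequency-reuse policy would be expected to learn. Each user computes its term from its own channel and its neighbors' channels only, in keeping with the local-information setting described above.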

Source journal

IEEE Wireless Communications Letters (Engineering: Electrical and Electronic Engineering)

CiteScore: 12.30

Self-citation rate: 6.30%

Articles published: 481

Journal description: IEEE Wireless Communications Letters publishes short papers in a rapid publication cycle on advances in the state of the art of wireless communications. Both theoretical contributions (including new techniques, concepts, and analyses) and practical contributions (including system experiments and prototypes, and new applications) are encouraged. The journal focuses on the physical layer and the link layer of wireless communication systems.