Multi-Agent Double Deep Q-Learning for Fairness in Multiple-Access Underlay Cognitive Radio Networks

Zain Ali, Zouheir Rezki, Hamid Sadjadpour
{"title":"Multi-Agent Double Deep Q-Learning for Fairness in Multiple-Access Underlay Cognitive Radio Networks","authors":"Zain Ali;Zouheir Rezki;Hamid Sadjadpour","doi":"10.1109/TMLCN.2024.3391216","DOIUrl":null,"url":null,"abstract":"Underlay Cognitive Radio (CR) systems were introduced to resolve the issue of spectrum scarcity in wireless communication. In CR systems, an unlicensed Secondary Transmitter (ST) shares the channel with a licensed Primary Transmitter (PT). Spectral efficiency of the CR systems can be further increased if multiple STs share the same channel. In underlay CR systems, the STs are required to keep interference at a low level to avoid outage at the primary system. The restriction on interference in underlay CR prevents some STs from transmitting while other STs may achieve high data rates, thus making the underlay CR network unfair. In this work, we consider the problem of achieving fairness in the rates of the STs. The considered optimization problem is non-convex in nature. The conventional iteration-based optimizers are time-consuming and may not converge when the considered problem is non-convex. To deal with the problem, we propose a deep-Q reinforcement learning (DQ-RL) framework that employs two separate deep neural networks for the computation and estimation of the Q-values which provides a fast solution and is robust to channel dynamic. The proposed technique achieves near optimal values of fairness while offering primary outage probability of less than 4%. Further, increasing the number of STs results in a linear increase in the computational complexity of the proposed framework. A comparison of several variants of the proposed scheme with the optimal solution is also presented. Finally, we present a novel cumulative reward framework and discuss how the combined-reward approach improves the performance of the communication system.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"2 ","pages":"580-595"},"PeriodicalIF":0.0000,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10504881","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Machine Learning in Communications and Networking","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10504881/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Underlay Cognitive Radio (CR) systems were introduced to resolve the issue of spectrum scarcity in wireless communication. In CR systems, an unlicensed Secondary Transmitter (ST) shares the channel with a licensed Primary Transmitter (PT). The spectral efficiency of CR systems can be further increased if multiple STs share the same channel. In underlay CR systems, the STs are required to keep interference at a low level to avoid an outage at the primary system. This restriction on interference prevents some STs from transmitting while other STs may achieve high data rates, making the underlay CR network unfair. In this work, we consider the problem of achieving fairness in the rates of the STs. The considered optimization problem is non-convex in nature. Conventional iteration-based optimizers are time-consuming and may not converge when the problem is non-convex. To deal with this, we propose a deep-Q reinforcement learning (DQ-RL) framework that employs two separate deep neural networks for the computation and estimation of the Q-values, which provides a fast solution and is robust to channel dynamics. The proposed technique achieves near-optimal fairness while keeping the primary outage probability below 4%. Further, increasing the number of STs results in only a linear increase in the computational complexity of the proposed framework. A comparison of several variants of the proposed scheme with the optimal solution is also presented. Finally, we present a novel cumulative reward framework and discuss how the combined-reward approach improves the performance of the communication system.
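
Since the abstract only outlines the approach, two short sketches may help make the mechanics concrete. First, a candidate per-step reward for the STs: the paper optimizes fairness in the ST rates under an interference constraint, and Jain's fairness index is a standard measure of rate fairness. The abstract does not specify the reward actually used, so the weighting and penalty below are illustrative assumptions, not the paper's design.

```python
import numpy as np

def jains_fairness(rates):
    """Jain's index: 1/N when one ST takes all the rate, 1 when rates are equal."""
    rates = np.asarray(rates, dtype=float)
    total = rates.sum()
    if total == 0.0:
        return 0.0
    return total**2 / (len(rates) * np.sum(rates**2))

def st_reward(rates, interference, interference_threshold):
    """Hypothetical reward: fairness-weighted sum rate, with a penalty when the
    aggregate interference at the primary receiver exceeds the underlay limit.
    The penalty value and the weighting are assumptions for illustration only."""
    if interference > interference_threshold:
        return -1.0  # primary outage event: penalize
    return jains_fairness(rates) * float(np.sum(rates))
```

Second, "two separate deep neural networks for the computation and estimation of the Q-values" is the double deep Q-learning pattern: an online network selects the greedy next action while a periodically synchronized target network evaluates it, which curbs the overestimation bias of vanilla DQN. The sketch below assumes each ST's action is a discrete transmit-power level, which the abstract does not confirm.

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small MLP mapping a channel-state observation to one Q-value per power level."""
    def __init__(self, obs_dim, n_power_levels, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_power_levels),
        )

    def forward(self, obs):
        return self.net(obs)

def double_dqn_loss(online, target, batch, gamma=0.99):
    """Double-DQN TD loss: the online net picks the next action,
    the target net scores it (action selection decoupled from evaluation)."""
    obs, act, rew, next_obs, done = batch
    q = online(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_act = online(next_obs).argmax(dim=1, keepdim=True)
        next_q = target(next_obs).gather(1, next_act).squeeze(1)
        td_target = rew + gamma * (1.0 - done) * next_q
    return nn.functional.mse_loss(q, td_target)
```

In a multi-agent deployment, each ST would typically run its own pair of networks and refresh the target weights periodically, e.g. `target.load_state_dict(online.state_dict())` every few hundred updates; the synchronization schedule here is likewise an assumption.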