Goal-Oriented Reinforcement Learning in THz-Enabled UAV-Aided Network Using Supervised Learning

IEEE Open Journal of the Communications Society (IF 6.3; JCR Q1, Engineering, Electrical & Electronic) | Pub Date: 2024-08-12 | DOI: 10.1109/OJCOMS.2024.3442709
Atefeh Termehchi; Tingnan Bao; Aisha Syed; William Sean Kennedy; Melike Erol-Kantarci
Full text: https://ieeexplore.ieee.org/document/10634216/
Citations: 0

Abstract

Deep reinforcement learning (DRL) has been a key machine learning technique in many 5G and 6G applications. DRL agents learn optimal (or sub-optimal) policies by interacting with the environment. However, this process often involves numerous uninformative and repetitive message transmissions between the DRL agent and its environment. In this paper, we address the problem of reducing interactions between the DRL agent and the environment, called goal-oriented DRL. Meanwhile, Terahertz (THz) bands and unmanned aerial vehicles (UAVs) are considered two of the main enablers of 6G. Therefore, we investigate the goal-oriented DRL problem in a THz-enabled UAV-aided network. We formulate it as an optimization problem with the goals of i) reducing interactions between the UAV (DRL agent) and IoT devices (environment), ii) maximizing the number of served IoT devices, and iii) ensuring fairness. The constraints include the movement characteristics of IoT devices, the maximum speed limitation of the UAV, the QoS requirements of the served IoT devices, and the limited uplink coverage of the THz-enabled UAV. The resulting problem is a mixed-integer nonlinear program (MINLP) and is NP-hard. To address it, we employ the decoupling optimization method and an approach inspired by the self-triggered method from control engineering. Specifically, the problem is divided into two sub-problems, and we then propose using supervised learning as a teacher for DRL to reduce the interactions. Our simulation results show that the goal-oriented DRL approach outperforms conventional methods: it reduces interactions while maintaining good performance in terms of the number of served IoT devices and fairness.
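The self-triggered idea and the fairness goal mentioned in the abstract can be illustrated with a small sketch. This is not the authors' method: the abstract gives no formulas, so everything below is an assumption made for illustration. In particular, a hand-written worst-case bound stands in for the supervised-learning teacher, Jain's index is used as one common fairness measure (the abstract does not say which measure the paper uses), and the function and parameter names (`next_interaction_delay`, `coverage_radius`, `max_device_speed`, `min_delay`) are hypothetical.

```python
def next_interaction_delay(distance_to_uav, coverage_radius,
                           max_device_speed, min_delay=1.0):
    """Illustrative self-triggered rule (an assumption, not the paper's rule).

    Returns how long the UAV can wait before it next needs to hear from a
    device, based on the worst case where the device moves straight toward
    the edge of the coverage disc at its maximum speed.
    """
    margin = coverage_radius - distance_to_uav
    if margin <= 0:
        # Device is already at or beyond the coverage edge:
        # interact again as soon as allowed.
        return min_delay
    return max(min_delay, margin / max_device_speed)


def jain_fairness(service_counts):
    """Jain's fairness index: 1.0 when all devices are served equally,
    1/n when a single device receives all the service.
    Returns 0.0 if no device has been served at all."""
    total = sum(service_counts)
    if total == 0:
        return 0.0
    sum_of_squares = sum(c * c for c in service_counts)
    return total * total / (len(service_counts) * sum_of_squares)
```

For example, under these assumptions a device 20 m from the UAV, inside a 50 m coverage radius and moving at no more than 2 m/s, cannot leave coverage for at least 15 s, so `next_interaction_delay(20.0, 50.0, 2.0)` returns `15.0` and the intermediate status messages can be skipped. In the paper, a learned predictor rather than a fixed bound plays the role of deciding when interaction is needed.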
Source journal: IEEE Open Journal of the Communications Society
CiteScore: 13.70
Self-citation rate: 3.80%
Articles published: 94
Review time: 10 weeks
Journal description: The IEEE Open Journal of the Communications Society (OJ-COMS) is an open access, all-electronic journal that publishes original high-quality manuscripts on advances in the state of the art of telecommunications systems and networks. The papers in IEEE OJ-COMS are included in Scopus. Submissions reporting new theoretical findings (including novel methods, concepts, and studies) and practical contributions (including experiments and development of prototypes) are welcome. Additionally, survey and tutorial articles are considered. The IEEE OJ-COMS received its debut impact factor of 7.9 according to the Journal Citation Reports (JCR) 2023. The journal covers science, technology, applications and standards for information organization, collection and transfer using electronic, optical and wireless channels and networks. Specific areas covered include: systems and network architecture, control and management; protocols, software, and middleware; quality of service, reliability, and security; modulation, detection, coding, and signaling; switching and routing; mobile and portable communications; terminals and other end-user devices; networks for content distribution and distributed computing; and communications-based distributed resources control.