Atefeh Termehchi;Tingnan Bao;Aisha Syed;William Sean Kennedy;Melike Erol-Kantarci
Title: Goal-Oriented Reinforcement Learning in THz-Enabled UAV-Aided Network Using Supervised Learning
Journal: IEEE Open Journal of the Communications Society (JCR Q1, Engineering, Electrical & Electronic; listed impact factor 6.3)
Published: 2024-08-12
DOI: 10.1109/OJCOMS.2024.3442709
URL: https://ieeexplore.ieee.org/document/10634216/
Citations: 0
Abstract
Deep reinforcement learning (DRL) has been a key machine learning technique in many 5G and 6G applications. DRL agents learn optimal (or sub-optimal) policies by interacting with the environment. However, this process often involves numerous uninformative and repetitive message transmissions between the DRL agent and its environment. In this paper, we address the problem of reducing interactions between the DRL agent and the environment, called goal-oriented DRL. Meanwhile, Terahertz (THz) bands and unmanned aerial vehicles (UAVs) are considered two of the main enablers of 6G. Therefore, we investigate the goal-oriented DRL problem in a THz-enabled UAV-aided network. We formulate it as an optimization problem with the goals of i) reducing interactions between the UAV (DRL agent) and IoT devices (environment), ii) maximizing the number of served IoT devices, and iii) ensuring fairness. The constraints include the movement characteristics of IoT devices, the maximum speed limitation of the UAV, the QoS requirements of the served IoT devices, and the limited uplink coverage of the THz-enabled UAV. This problem is a mixed-integer nonlinear programming (MINLP) problem and is NP-hard. To address it, we employ the decoupling optimization method and an approach inspired by the self-triggered method from control engineering. Specifically, the problem is divided into two sub-problems. We then propose using supervised learning as a teacher for DRL to reduce the interactions. Our simulation results show that the goal-oriented DRL approach outperforms conventional methods by reducing interactions while maintaining good performance in terms of the number of served IoT devices and fairness.
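The self-triggered idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the function name, the 2-D geometry, the circular coverage model, and the `max_skip` cap are all assumptions made for illustration. After each interaction, the UAV computes a conservative number of time steps it can wait before polling the IoT devices again, based on how long the closest-to-edge covered device would need to leave coverage at its maximum speed.

```python
import math


def steps_until_next_interaction(uav_xy, devices_xy, coverage_radius,
                                 max_device_speed, dt=1.0, max_skip=50):
    """Toy self-triggered rule (hypothetical, for illustration only).

    Returns how many time steps of length dt the UAV can skip before it
    must interact with the devices again, always at least 1.
    """
    # Distance from each device to the edge of the coverage disc
    # (negative if the device is already outside coverage).
    margins = [coverage_radius - math.dist(uav_xy, d) for d in devices_xy]
    covered = [m for m in margins if m >= 0]

    # Nothing is covered: interact again at the very next step.
    if not covered:
        return 1

    # Static devices never leave coverage, so only the cap applies.
    if max_device_speed <= 0:
        return max_skip

    # A device needs at least margin / speed seconds to reach the edge.
    safe_time = min(covered) / max_device_speed
    return max(1, min(max_skip, math.floor(safe_time / dt)))


# Usage: devices at distance 5 and 1 from the UAV, coverage radius 10,
# device speed at most 1 unit per step.
print(steps_until_next_interaction((0.0, 0.0), [(3.0, 4.0), (0.0, 1.0)],
                                   coverage_radius=10.0,
                                   max_device_speed=1.0))  # 5
```

In the paper's setting, a supervised model acts as a teacher for the DRL agent to reduce the interactions; the closed-form bound above only stands in for that learned component to show how an interaction schedule can depend on the latest observation.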
About the journal:
The IEEE Open Journal of the Communications Society (OJ-COMS) is an open access, all-electronic journal that publishes original high-quality manuscripts on advances in the state of the art of telecommunications systems and networks. The papers in IEEE OJ-COMS are included in Scopus. Submissions reporting new theoretical findings (including novel methods, concepts, and studies) and practical contributions (including experiments and development of prototypes) are welcome. Additionally, survey and tutorial articles are considered. IEEE OJ-COMS received its debut impact factor of 7.9 according to the Journal Citation Reports (JCR) 2023.
The IEEE Open Journal of the Communications Society covers science, technology, applications and standards for information organization, collection and transfer using electronic, optical and wireless channels and networks. Some specific areas covered include:
Systems and network architecture, control and management
Protocols, software, and middleware
Quality of service, reliability, and security
Modulation, detection, coding, and signaling
Switching and routing
Mobile and portable communications
Terminals and other end-user devices
Networks for content distribution and distributed computing
Communications-based distributed resources control.