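Three-Dimensional Trajectory and Resource Allocation Optimization in Multi-Unmanned Aerial Vehicle Multicast System: A Multi-Agent Reinforcement Learning Method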

Drones | IF 4.4 | Zone 2 (Earth Sciences) | JCR Q1 (Remote Sensing) | Pub Date: 2023-10-19 | DOI: 10.3390/drones7100641
Dongyu Wang, Yue Liu, Hongda Yu, Yanzhao Hou
Citations: 0

Abstract

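Unmanned aerial vehicles (UAVs) can act as movable aerial base stations to enhance wireless coverage for edge users with poor ground communication quality. In urban environments, however, the link between UAVs and ground users can be blocked by obstacles, especially where complicated terrestrial infrastructure increases the probability of non-line-of-sight (NLoS) links. To improve the average throughput, this paper proposes a multi-UAV multicast system in which a multi-agent reinforcement learning method helps the UAVs determine their optimal altitude and trajectory. Intelligent reflective surfaces (IRSs) are also employed to reflect signals and so address the blocking problem. Furthermore, since a UAV's onboard power is limited, the paper aims to minimize the UAVs' energy consumption and maximize the transmission rate for edge users by jointly optimizing the UAVs' 3D trajectory and transmit power. First, the channel capacity of ground users in different multicast groups is derived. Next, the K-medoids algorithm is used to solve the multicast grouping problem for edge users based on their transmission rate requirements. Then, the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is employed to learn an optimal solution and to eliminate the non-stationarity of multi-agent training. Finally, simulation results show that the proposed system increases the average throughput by approximately 14% compared with the non-grouping system, and that the MADDPG algorithm achieves a 20% improvement in reducing UAV energy consumption compared with traditional deep reinforcement learning (DRL) methods.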
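The abstract states that K-medoids is used to group edge users by transmission rate requirement, but it does not give the feature definition, the distance measure, or the update rule. The following is therefore only a minimal sketch of a generic alternating K-medoids procedure in Python, assuming each user is described by a one-dimensional rate requirement and that Euclidean distance is used. The function name `k_medoids` and the sample rates are hypothetical and are not taken from the paper.

```python
import random
from math import dist  # Euclidean distance between two points (Python 3.8+)


def k_medoids(points, k, max_iter=100, seed=0):
    """Generic alternating K-medoids (an assumption, not the paper's exact variant).

    points: list of equal-length tuples, one per user.
    Returns (medoid_indices, labels), where labels[i] is the group of points[i].
    """
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)  # k distinct starting medoids

    def assign(current):
        # Each point joins the group of its nearest medoid.
        return [min(range(k), key=lambda c: dist(p, points[current[c]]))
                for p in points]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if a group happens to be empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the smallest total
            # distance to the other members of its group.
            best = min(members,
                       key=lambda i: sum(dist(points[i], points[j]) for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break  # medoids stopped changing
        medoids = new_medoids

    return medoids, assign(medoids)


# Hypothetical rate requirements (arbitrary units) for six edge users.
rates = [1.0, 1.2, 0.9, 5.0, 5.5, 4.8]
medoids, labels = k_medoids([(r,) for r in rates], k=2)
print(labels)
```

Unlike K-means, each group centre here is an actual user, which keeps the representative rate of a multicast group equal to a rate that some user really requested.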
Source journal
Drones (Engineering: Aerospace Engineering)
CiteScore: 5.60
Self-citation rate: 18.80%
Articles published: 331
Latest articles from this journal
Firefighting Drone Configuration and Scheduling for Wildfire Based on Loss Estimation and Minimization
Wind Tunnel Balance Measurements of Bioinspired Tails for a Fixed Wing MAV
Three-Dimensional Indoor Positioning Scheme for Drone with Fingerprint-Based Deep-Learning Classifier
Blockchain-Enabled Infection Sample Collection System Using Two-Echelon Drone-Assisted Mechanism
Joint Trajectory Design and Resource Optimization in UAV-Assisted Caching-Enabled Networks with Finite Blocklength Transmissions