Energy-Efficient Multi-Agent Reinforcement Learning for UAV Trajectory Optimization in Cell-Free Massive MIMO Networks

Impact factor: 10.7 | Zone 1 (Computer Science) | JCR Q1 (Engineering, Electrical & Electronic) | IEEE Transactions on Wireless Communications | Pub Date: 2025-03-20 | DOI: 10.1109/TWC.2025.3550266
Zhilong Liu;Jiayi Zhang;Yong Zeng;Bo Ai
IEEE Transactions on Wireless Communications, vol. 24, no. 7, pp. 5917-5930. https://ieeexplore.ieee.org/document/10932660/
Citations: 0

Abstract

Uncrewed aerial vehicle (UAV)-aided space-air-ground integrated networks (SAGINs) are a pivotal direction for enhancing global data transmission. In this paper, we focus on the trajectory optimization problem with the goal of maximizing energy efficiency (EE), thereby balancing system capacity against energy expenditure. To this end, we first introduce a cell-free SAGIN in which UAVs function as flying access points serving ground user equipment (GUEs). Because the transmission power of satellite direct-to-cell devices typically exceeds that of GUEs, we investigate the resulting interference and derive exact closed-form expressions for the uplink spectral efficiency. To improve service access efficiency, we propose a GUE grouping scheme based on density distribution. We then establish an EE analysis model that accounts for the power consumption of fixed-wing UAVs. To solve the UAV trajectory optimization problem, we propose two algorithms operating over two timescales: a successive convex approximation strategy and a multi-agent reinforcement learning (MARL)-based algorithm. In particular, the proposed MARL algorithm employs a shared Critic network, which reduces the number of training parameters and hence the algorithmic complexity. Importantly, our approach comprehensively optimizes the UAV trajectory, acceleration, and velocity. The results show that the proposed GUE grouping algorithm and the MARL-based optimization algorithm adapt to dynamic, time-varying environments.
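The abstract defines EE as the balance between system capacity and energy expenditure for fixed-wing UAVs, but it does not give the power model. As a rough illustration only, the Python sketch below computes EE in bits per joule along a sampled trajectory, assuming the widely used level-flight propulsion model for fixed-wing aircraft (parasitic drag growing with the cube of speed, induced drag shrinking with speed and growing with centripetal acceleration). The function names, the coefficients `c1` and `c2`, and the sample rates are hypothetical and are not taken from the paper.

```python
import math

G = 9.8  # gravitational acceleration in m/s^2


def propulsion_power(v, a, c1=9.26e-4, c2=2250.0):
    """Instantaneous propulsion power (W) of a fixed-wing UAV in level flight.

    v, a: 2-D velocity (m/s) and acceleration (m/s^2) as (x, y) tuples.
    c1, c2: assumed aircraft-dependent drag coefficients (illustrative values).
    """
    speed = math.hypot(v[0], v[1])
    if speed <= 0.0:
        raise ValueError("a fixed-wing UAV needs a nonzero speed to stay aloft")
    a_sq = a[0] ** 2 + a[1] ** 2
    # Component of the acceleration along the heading direction.
    a_par = (a[0] * v[0] + a[1] * v[1]) / speed
    # Only the perpendicular (turning) component increases induced drag.
    a_perp_sq = max(a_sq - a_par ** 2, 0.0)
    return c1 * speed ** 3 + (c2 / speed) * (1.0 + a_perp_sq / G ** 2)


def energy_efficiency(rates_bps, velocities, accelerations, dt):
    """EE in bits per joule over a trajectory sampled every dt seconds.

    The kinetic-energy change between the first and last sample is ignored
    here, which is exact only when the start and end speeds are equal.
    """
    bits = sum(r * dt for r in rates_bps)
    energy = sum(
        propulsion_power(v, a) * dt for v, a in zip(velocities, accelerations)
    )
    return bits / energy


if __name__ == "__main__":
    steps = 10
    ee = energy_efficiency(
        rates_bps=[1e6] * steps,            # assumed constant 1 Mbit/s sum rate
        velocities=[(30.0, 0.0)] * steps,   # straight flight at 30 m/s
        accelerations=[(0.0, 0.0)] * steps,
        dt=1.0,
    )
    print(f"EE = {ee:.1f} bit/J")
```

Under this assumed model, turning raises the power draw while purely tangential acceleration leaves the drag terms unchanged, which is one reason a trajectory optimizer would treat velocity and acceleration as decision variables alongside position.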
Source journal
CiteScore: 18.60
Self-citation rate: 10.60%
Articles published: 708
Review time: 5.6 months
Journal description: The IEEE Transactions on Wireless Communications publishes theoretical and practical advances in wireless communications. Its scope includes modulation and coding, detection and estimation, propagation and channel characterization, and diversity techniques, with an emphasis on the physical and link layer communication aspects of network architectures and protocols. The journal is also open to papers on specific or non-traditional topics related to particular application areas, including simulation tools and methodologies, orthogonal frequency division multiplexing, MIMO systems, and wireless over optical technologies.
Latest articles from this journal
Deterministic Statistical QoS Guarantee over FBL-AMC-HARQ based Cell-Free mMIMO
Resource Allocation for Heterogeneous Services in Satellite-Terrestrial IoT Networks With Multi-Access Edge Computing
Dual Time-Scale Resource Allocation for Hybrid VR and Haptic Services With Diffusion-Based DRL
Joint Energy-Efficient and Throughput Optimization in Large-Scale Mobile Networks via Safe Hierarchical MARL
A Fingerprint Database Generation Method for RIS-Assisted Indoor Positioning