Collaborative Ground-Space Communications via Evolutionary Multi-Objective Deep Reinforcement Learning

Jiahui Li; Geng Sun; Qingqing Wu; Dusit Niyato; Jiawen Kang; Abbas Jamalipour; Victor C. M. Leung
DOI: 10.1109/JSAC.2024.3459029
Journal: IEEE Journal on Selected Areas in Communications, vol. 42, no. 12, pp. 3395-3411
Publication date: 2024-09-12
Publication type: Journal Article
URL: https://ieeexplore.ieee.org/document/10679228/
Citations: 0

Abstract

Low Earth Orbit (LEO) satellites have emerged as crucial enablers of direct connections with remote terrestrial terminals. However, energy limitations and insufficient antenna capabilities at the terminals often hamper these connections, resulting in inefficient communications and frequent ping-pong handovers. This paper proposes a Distributed Collaborative Beamforming (DCB)-based uplink communication paradigm for enabling direct ground-space communications. Specifically, DCB treats the terminals that cannot establish efficient direct connections with the LEO satellites as distributed antennas, forming a virtual antenna array that enhances the terminal-to-satellite uplink achievable rates and durations. However, such systems need multiple trade-off policies that jointly balance the terminal-satellite uplink achievable rate, the energy consumption of the terminals, and the satellite switching frequency in order to accommodate changing scenario requirements. Thus, we formulate a long-term multi-objective optimization problem to optimize these goals simultaneously. To keep the approach applicable across terminal clusters of different scales, we reformulate this problem as an action-space-reduced and universal Multi-Objective Markov Decision Process (MOMDP). Then, we propose an Evolutionary Multi-Objective Deep Reinforcement Learning (EMODRL) algorithm to obtain multiple policies, in which low-value actions are masked to speed up training. Simulation results show that DCB enables terminals that cannot reach the uplink achievable rate threshold to achieve efficient direct uplink transmission. Moreover, the proposed algorithm outperforms various baselines and reduces the handover frequency by 30% at a similar uplink achievable rate compared with the rate-greedy method, which indicates that the proposed method is an effective solution for enabling direct ground-space communications.
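The abstract mentions that low-value actions are masked to speed up training but does not say how. As a rough illustration only, the sketch below shows one common way to apply an action mask in a policy: masked actions receive zero probability and the remaining logits are renormalized with a softmax. The function name `masked_softmax`, the logits, and the mask are hypothetical and are not taken from the paper; the paper's actual masking rule may differ.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over the actions whose mask entry is True.

    Masked (False) actions get probability 0. This is a generic
    illustration of action masking, not the paper's method.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must remain unmasked")
    # Subtract the largest allowed logit for numerical stability.
    peak = max(l for l, keep in zip(logits, mask) if keep)
    exps = [math.exp(l - peak) if keep else 0.0 for l, keep in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: four candidate actions, the last two flagged
# as low-value (how they are flagged is an assumption left open here).
probs = masked_softmax([1.0, 2.0, 5.0, 0.5], [True, True, False, False])
print(probs)
```

Because the masked actions are never sampled, the agent spends no exploration budget on them, which is the usual motivation for masking in large discrete action spaces.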