Jiahui Li;Geng Sun;Qingqing Wu;Dusit Niyato;Jiawen Kang;Abbas Jamalipour;Victor C. M. Leung
IEEE Journal on Selected Areas in Communications, vol. 42, no. 12, pp. 3395-3411. Published 2024-09-12. DOI: 10.1109/JSAC.2024.3459029. URL: https://ieeexplore.ieee.org/document/10679228/
Collaborative Ground-Space Communications via Evolutionary Multi-Objective Deep Reinforcement Learning
Low Earth Orbit (LEO) satellites have emerged as crucial enablers of direct connections with remote terrestrial terminals. However, energy limitations and insufficient antenna capabilities at the terminals often hamper these connections, resulting in inefficient communications and frequent ping-pong handovers. This paper proposes a Distributed Collaborative Beamforming (DCB)-based uplink communication paradigm for enabling direct ground-space communications. Specifically, DCB treats terminals that cannot establish efficient direct connections with the LEO satellites as distributed antennas, forming a virtual antenna array that increases both the achievable rate and the duration of the terminal-to-satellite uplink. However, such systems need multiple trade-off policies that jointly balance the terminal-satellite uplink achievable rate, the energy consumption of the terminals, and the satellite switching frequency, so that they can adapt to changing scenario requirements. Thus, we formulate a long-term multi-objective optimization problem to optimize these goals simultaneously. To keep the approach applicable across terminal clusters of different scales, we reformulate this problem as a universal Multi-Objective Markov Decision Process (MOMDP) with a reduced action space. Then, we propose an Evolutionary Multi-Objective Deep Reinforcement Learning (EMODRL) algorithm to obtain multiple policies, in which low-value actions are masked to speed up training. Simulation results show that DCB enables terminals that cannot reach the uplink achievable rate threshold to achieve efficient direct uplink transmission. Moreover, the proposed algorithm outperforms various baselines and reduces handover frequency by 30% at a similar uplink achievable rate compared with the rate-greedy method, indicating that the proposed method is an effective solution for enabling direct ground-space communications.
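
The virtual-antenna-array idea behind DCB can be illustrated with a minimal Python sketch. It is not the system model of the paper: it assumes unit-amplitude channels, equal transmit power per terminal, perfectly known channel phases, and a Shannon-capacity rate formula. The names `array_power_gain` and `achievable_rate`, the terminal count, and the per-terminal SNR value are all hypothetical and chosen only for illustration. Under these assumptions, phase-aligned transmission from N terminals yields a received power N^2 times that of one terminal, which is how terminals below a rate threshold on their own might exceed it together.

```python
import cmath
import math
import random


def array_power_gain(channel_phases, tx_phases):
    """Received power relative to one unit-amplitude terminal.

    Each terminal contributes a unit phasor whose phase is the sum of its
    channel phase and the phase it applies before transmitting.
    """
    total = sum(cmath.exp(1j * (h + w)) for h, w in zip(channel_phases, tx_phases))
    return abs(total) ** 2


def achievable_rate(snr_linear, bandwidth_hz=1.0):
    """Shannon rate in bit/s for a linear SNR (assumed rate model)."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


rng = random.Random(0)   # fixed seed so the run is repeatable
n_terminals = 8          # hypothetical cluster size
single_snr = 0.5         # hypothetical per-terminal SNR (linear scale)

# Random channel phases from each terminal to the satellite.
channel = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_terminals)]

# Collaborative beamforming: each terminal cancels its own channel phase.
aligned = [-h for h in channel]
# No coordination: terminals transmit without phase compensation.
unaligned = [0.0] * n_terminals

coherent_gain = array_power_gain(channel, aligned)
incoherent_gain = array_power_gain(channel, unaligned)

single_rate = achievable_rate(single_snr)
dcb_rate = achievable_rate(single_snr * coherent_gain)

print(f"coherent gain:   {coherent_gain:.2f}")
print(f"incoherent gain: {incoherent_gain:.2f}")
print(f"rate, one terminal:  {single_rate:.3f} bit/s/Hz")
print(f"rate, aligned array: {dcb_rate:.3f} bit/s/Hz")
```

Note that the aligned array also radiates N times the total power of a single terminal, so the sketch says nothing about the energy side of the trade-off that the multi-objective formulation is meant to balance.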