Qiaolin Ouyang;Neng Ye;Wonjae Shin;Xiaozheng Gao;Dusit Niyato;Kai Yang
IEEE Transactions on Communications, vol. 73, no. 8, pp. 6673-6688. Published 2025-01-13. DOI: 10.1109/TCOMM.2025.3529212
https://ieeexplore.ieee.org/document/10839405/
Journal Article. Impact factor: 8.4. JCR: Q1 (Engineering, Electrical & Electronic).
Dependency-Elimination MADRL: Scalable On-Board Resource Allocation for Feeder- and User-Link Integrated Satellite Communications
Integrating feeder- and user-links in multi-beam satellite communications significantly enhances system flexibility but requires effective resource allocation to fully realize its potential. Multi-agent deep reinforcement learning (MADRL) has emerged as a scalable solution for beam hopping by allowing each agent to optimize the transmission parameters for one beam. However, integrating feeder- and user-links introduces complicated dependencies, including resource competition between feeder- and user-links and data-flow coupling between uplinks and downlinks, which severely degrade agent cooperation. To approach the performance limit, this paper introduces a dependency-elimination MADRL framework incorporating model decomposition, link decoupling, and novel agent-level collaboration mechanisms to allocate beams, power, and bandwidth with reduced complexity. Specifically, to facilitate beam-level agent reuse for complexity reduction under the heterogeneity of feeder- and user-links, characterized by data-flow aggregation and division, we decouple bandwidth allocation from the learning model. The uplink-downlink dependencies in the bandwidth allocation are then resolved using a generalized water-filling strategy based on the performance upper bounds. Furthermore, we improve agent cooperation efficiency through state and reward decomposition and a novel non-cooperation penalty. Evaluations show that our method improves the system performance by up to 57.7% compared to state-of-the-art MADRL methods while reducing training complexity by more than 50%.
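For readers unfamiliar with the term, the abstract's "generalized water-filling strategy" builds on the classical water-filling idea. The sketch below shows only the textbook version, applied to power allocation across parallel channels, and is not the paper's generalized bandwidth-allocation method, whose details are not given here. The function name `water_filling` and the parameters `gains` and `total_power` are illustrative assumptions.

```python
def water_filling(gains, total_power):
    """Textbook water-filling (illustrative, not the paper's generalized variant).

    Maximizes sum(log2(1 + g_i * p_i)) subject to sum(p_i) = total_power
    and p_i >= 0. The optimum is p_i = max(0, level - 1 / g_i), where the
    "water level" is chosen so that the power budget is fully used.
    """
    if total_power < 0 or any(g <= 0 for g in gains):
        raise ValueError("total_power must be >= 0 and all gains must be > 0")

    # Each channel has a "floor" height of 1 / gain; better channels sit lower.
    floors = sorted(1.0 / g for g in gains)

    # Try using the k best channels, from all of them down to one, and stop
    # at the first k whose water level rises above the k-th floor.
    level = 0.0
    for k in range(len(floors), 0, -1):
        level = (total_power + sum(floors[:k])) / k
        if level > floors[k - 1]:
            break

    # Channels whose floor lies above the water level receive no power.
    return [max(0.0, level - 1.0 / g) for g in gains]


if __name__ == "__main__":
    print(water_filling([2.0, 1.0], 2.0))  # [1.25, 0.75]
```

In this toy setting the stronger channel receives more power, and a sufficiently weak channel receives none.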
About the journal:
The IEEE Transactions on Communications is dedicated to publishing high-quality manuscripts that showcase advancements in the state-of-the-art of telecommunications. Our scope encompasses all aspects of telecommunications, including telephone, telegraphy, facsimile, and television, facilitated by electromagnetic propagation methods such as radio, wire, aerial, underground, coaxial, and submarine cables, as well as waveguides, communication satellites, and lasers. We cover telecommunications in various settings, including marine, aeronautical, space, and fixed station services, addressing topics such as repeaters, radio relaying, signal storage, regeneration, error detection and correction, multiplexing, carrier techniques, communication switching systems, data communications, and communication theory. Join us in advancing the field of telecommunications through groundbreaking research and innovation.