依赖消除MADRL：支线和用户链路集成卫星通信的可扩展星上资源分配

IF 8.4 2区计算机科学 Q1 ENGINEERING, ELECTRICAL & ELECTRONIC IEEE Transactions on Communications Pub Date : 2025-01-13 DOI:10.1109/TCOMM.2025.3529212

Qiaolin Ouyang;Neng Ye;Wonjae Shin;Xiaozheng Gao;Dusit Niyato;Kai Yang

{"title":"依赖消除MADRL：支线和用户链路集成卫星通信的可扩展星上资源分配","authors":"Qiaolin Ouyang;Neng Ye;Wonjae Shin;Xiaozheng Gao;Dusit Niyato;Kai Yang","doi":"10.1109/TCOMM.2025.3529212","DOIUrl":null,"url":null,"abstract":"Integrating feeder- and user-links in multi-beam satellite communications significantly enhances system flexibility but requires effective resource allocation to fully realize its potential. Multi-agent deep reinforcement learning (MADRL) has emerged as a scalable solution for beam hopping, by allowing each agent to optimize the transmission parameters for one beam. However, integrating feeder- and user-links introduces complicated dependencies, including resource competition between feeder- and user-links and data-flow coupling between uplinks and downlinks, dramatically deteriorating agent cooperation. To approach the performance limit, this paper introduces a dependency-elimination MADRL framework incorporating model decomposition, link decoupling, and novel agent-level collaboration mechanisms to allocate beams, power, and bandwidth with reduced complexity. Specifically, to facilitate beam-level agent reuse for complexity reduction under the heterogeneity of feeder- and user-links, characterized by data-flow aggregation and division, we decouple bandwidth allocation from the learning model. The uplink-downlink dependencies in the bandwidth allocation is then resolved using a generalized water-filling strategy based on the performance upper bounds. Furthermore, we improve agent cooperation efficiency through state and reward decomposition and a novel non-cooperation penalty. Evaluations show that our method improves the system performance by up to 57.7% compared to sota MADRL methods while reducing training complexity by more than 50%.","PeriodicalId":13041,"journal":{"name":"IEEE Transactions on Communications","volume":"73 8","pages":"6673-6688"},"PeriodicalIF":8.4000,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Dependency-Elimination MADRL: Scalable On-Board Resource Allocation for Feeder- and User-Link Integrated Satellite Communications\",\"authors\":\"Qiaolin Ouyang;Neng Ye;Wonjae Shin;Xiaozheng Gao;Dusit Niyato;Kai Yang\",\"doi\":\"10.1109/TCOMM.2025.3529212\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Integrating feeder- and user-links in multi-beam satellite communications significantly enhances system flexibility but requires effective resource allocation to fully realize its potential. Multi-agent deep reinforcement learning (MADRL) has emerged as a scalable solution for beam hopping, by allowing each agent to optimize the transmission parameters for one beam. However, integrating feeder- and user-links introduces complicated dependencies, including resource competition between feeder- and user-links and data-flow coupling between uplinks and downlinks, dramatically deteriorating agent cooperation. To approach the performance limit, this paper introduces a dependency-elimination MADRL framework incorporating model decomposition, link decoupling, and novel agent-level collaboration mechanisms to allocate beams, power, and bandwidth with reduced complexity. Specifically, to facilitate beam-level agent reuse for complexity reduction under the heterogeneity of feeder- and user-links, characterized by data-flow aggregation and division, we decouple bandwidth allocation from the learning model. The uplink-downlink dependencies in the bandwidth allocation is then resolved using a generalized water-filling strategy based on the performance upper bounds. Furthermore, we improve agent cooperation efficiency through state and reward decomposition and a novel non-cooperation penalty. Evaluations show that our method improves the system performance by up to 57.7% compared to sota MADRL methods while reducing training complexity by more than 50%.\",\"PeriodicalId\":13041,\"journal\":{\"name\":\"IEEE Transactions on Communications\",\"volume\":\"73 8\",\"pages\":\"6673-6688\"},\"PeriodicalIF\":8.4000,\"publicationDate\":\"2025-01-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Communications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10839405/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Communications","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10839405/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}

引用次数: 0

摘要

在多波束卫星通信中，将馈线链路和用户链路相结合可以显著提高系统的灵活性，但需要有效的资源配置才能充分发挥其潜力。多智能体深度强化学习（MADRL）通过允许每个智能体优化一个波束的传输参数，已经成为一种可扩展的波束跳变解决方案。然而，集成馈线和用户链路引入了复杂的依赖关系，包括馈线和用户链路之间的资源竞争以及上行链路和下行链路之间的数据流耦合，极大地恶化了agent的合作。为了接近性能极限，本文引入了一个消除依赖的MADRL框架，该框架结合了模型分解、链路解耦和新颖的代理级协作机制，以降低复杂性来分配波束、功率和带宽。具体来说，在馈线和用户链路异构的情况下，以数据流聚合和分割为特征，为了促进波束级代理重用以降低复杂性，我们将带宽分配与学习模型解耦。然后使用基于性能上界的广义注水策略解决带宽分配中的上行链路和下行链路依赖关系。此外，我们通过状态和奖励分解以及一种新的不合作惩罚来提高代理的合作效率。评估表明，与其他MADRL方法相比，我们的方法将系统性能提高了57.7%，同时将训练复杂性降低了50%以上。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Dependency-Elimination MADRL: Scalable On-Board Resource Allocation for Feeder- and User-Link Integrated Satellite Communications

Integrating feeder- and user-links in multi-beam satellite communications significantly enhances system flexibility but requires effective resource allocation to fully realize its potential. Multi-agent deep reinforcement learning (MADRL) has emerged as a scalable solution for beam hopping, by allowing each agent to optimize the transmission parameters for one beam. However, integrating feeder- and user-links introduces complicated dependencies, including resource competition between feeder- and user-links and data-flow coupling between uplinks and downlinks, dramatically deteriorating agent cooperation. To approach the performance limit, this paper introduces a dependency-elimination MADRL framework incorporating model decomposition, link decoupling, and novel agent-level collaboration mechanisms to allocate beams, power, and bandwidth with reduced complexity. Specifically, to facilitate beam-level agent reuse for complexity reduction under the heterogeneity of feeder- and user-links, characterized by data-flow aggregation and division, we decouple bandwidth allocation from the learning model. The uplink-downlink dependencies in the bandwidth allocation is then resolved using a generalized water-filling strategy based on the performance upper bounds. Furthermore, we improve agent cooperation efficiency through state and reward decomposition and a novel non-cooperation penalty. Evaluations show that our method improves the system performance by up to 57.7% compared to sota MADRL methods while reducing training complexity by more than 50%.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Communications 工程技术-电信学

CiteScore

16.10

自引率

8.40%

发文量

528

审稿时长

4.1 months

期刊介绍： The IEEE Transactions on Communications is dedicated to publishing high-quality manuscripts that showcase advancements in the state-of-the-art of telecommunications. Our scope encompasses all aspects of telecommunications, including telephone, telegraphy, facsimile, and television, facilitated by electromagnetic propagation methods such as radio, wire, aerial, underground, coaxial, and submarine cables, as well as waveguides, communication satellites, and lasers. We cover telecommunications in various settings, including marine, aeronautical, space, and fixed station services, addressing topics such as repeaters, radio relaying, signal storage, regeneration, error detection and correction, multiplexing, carrier techniques, communication switching systems, data communications, and communication theory. Join us in advancing the field of telecommunications through groundbreaking research and innovation.