Dependency-Elimination MADRL: Scalable On-Board Resource Allocation for Feeder- and User-Link Integrated Satellite Communications

IF 8.4 | Zone 2 (Computer Science) | JCR Q1 (Engineering, Electrical & Electronic) | IEEE Transactions on Communications | Pub Date: 2025-01-13 | DOI: 10.1109/TCOMM.2025.3529212
Qiaolin Ouyang;Neng Ye;Wonjae Shin;Xiaozheng Gao;Dusit Niyato;Kai Yang
IEEE Transactions on Communications, vol. 73, no. 8, pp. 6673-6688. Full text: https://ieeexplore.ieee.org/document/10839405/
Citations: 0

Abstract

Integrating feeder- and user-links in multi-beam satellite communications significantly enhances system flexibility but requires effective resource allocation to fully realize its potential. Multi-agent deep reinforcement learning (MADRL) has emerged as a scalable solution for beam hopping by allowing each agent to optimize the transmission parameters for one beam. However, integrating feeder- and user-links introduces complicated dependencies, including resource competition between feeder- and user-links and data-flow coupling between uplinks and downlinks, which dramatically degrade agent cooperation. To approach the performance limit, this paper introduces a dependency-elimination MADRL framework that incorporates model decomposition, link decoupling, and novel agent-level collaboration mechanisms to allocate beams, power, and bandwidth with reduced complexity. Specifically, feeder- and user-links are heterogeneous, being characterized by data-flow aggregation and division respectively; to facilitate beam-level agent reuse for complexity reduction under this heterogeneity, we decouple bandwidth allocation from the learning model. The uplink-downlink dependencies in the bandwidth allocation are then resolved using a generalized water-filling strategy based on the performance upper bounds. Furthermore, we improve agent cooperation efficiency through state and reward decomposition and a novel non-cooperation penalty. Evaluations show that our method improves the system performance by up to 57.7% compared to state-of-the-art MADRL methods while reducing training complexity by more than 50%.
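For readers unfamiliar with the term, the following is a minimal sketch of classical (textbook) water-filling, which the abstract's "generalized water-filling strategy" presumably builds on. It is not the paper's algorithm: the abstract gives no details of the generalization, and the classical form shown here allocates power across parallel channels rather than bandwidth across coupled uplinks and downlinks. The function name `water_filling`, its parameters, and the example gains are all hypothetical, chosen only for illustration.

```python
import numpy as np


def water_filling(gains, total_power, tol=1e-9):
    """Textbook water-filling (illustrative only, not the paper's method).

    Maximizes sum(log2(1 + g_i * p_i)) subject to sum(p_i) = total_power
    and p_i >= 0. The optimum has the form p_i = max(mu - 1/g_i, 0),
    where the "water level" mu is found here by bisection.
    """
    gains = np.asarray(gains, dtype=float)
    floors = 1.0 / gains  # inverse gain acts as the "floor height" of each channel

    # The water level lies between the lowest floor (zero power used)
    # and the highest floor plus the whole budget (at least the budget used).
    lo, hi = floors.min(), floors.max() + total_power
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        used = np.maximum(mu - floors, 0.0).sum()
        if used > total_power:
            hi = mu  # too much water: lower the level
        else:
            lo = mu  # budget not exhausted: raise the level
    return np.maximum(0.5 * (lo + hi) - floors, 0.0)


# Hypothetical example: three channels with normalized gains 2.0, 1.0, 0.1.
# The weakest channel sits above the water level and receives no power;
# the allocation is approximately [0.75, 0.25, 0.0].
allocation = water_filling([2.0, 1.0, 0.1], total_power=1.0)
print(allocation)
```

The sketch shows the defining property of water-filling: resources go preferentially to the better channels, and sufficiently poor channels are left unserved. How the paper extends this idea to uplink-downlink coupling and performance upper bounds is described in the full text, not here.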
Source journal: IEEE Transactions on Communications (Engineering & Technology: Telecommunications)
CiteScore: 16.10
Self-citation rate: 8.40%
Articles published: 528
Review time: 4.1 months
Journal description: The IEEE Transactions on Communications is dedicated to publishing high-quality manuscripts that showcase advances in the state of the art of telecommunications. Its scope encompasses all aspects of telecommunications, including telephone, telegraphy, facsimile, and television, facilitated by electromagnetic propagation methods such as radio, wire, aerial, underground, coaxial, and submarine cables, as well as waveguides, communication satellites, and lasers. It covers telecommunications in various settings, including marine, aeronautical, space, and fixed station services, addressing topics such as repeaters, radio relaying, signal storage, regeneration, error detection and correction, multiplexing, carrier techniques, communication switching systems, data communications, and communication theory.