Dynamic Beam Hopping for DVB-S2X Satellite: A Multi-Objective Deep Reinforcement Learning Approach

Yuchen Zhang, Xin Hu, Rong Chen, Zhili Zhang, Liquan Wang, Weidong Wang
Venue: 2019 IEEE International Conferences on Ubiquitous Computing & Communications (IUCC) and Data Science and Computational Intelligence (DSCI) and Smart Computing, Networking and Services (SmartCNS)
Published: 2019-10-01
DOI: 10.1109/IUCC/DSCI/SmartCNS.2019.00056 (https://doi.org/10.1109/IUCC/DSCI/SmartCNS.2019.00056)
Citations: 8

Abstract

Dynamic Beam Hopping (DBH) is a crucial technology for flexibly adapting to different service configurations in the multi-beam satellite communications market. Conventional beam hopping methods ignore the intrinsic correlation between successive decisions and therefore obtain only the solution that is optimal at the current time, whereas deep reinforcement learning (DRL) is well suited to sequential decision problems. To address the DBH problem in a Differentiated Services (DIFFSERV) scenario, this paper designs a multi-objective deep reinforcement learning (MO-DRL) algorithm. In addition, as the required number of beams increases, the complexity of system implementation increases significantly. To mitigate this curse of dimensionality, the paper proposes a time division multi-action selection method (TD-MASM). Under realistic conditions, the low-complexity MO-DRL algorithm ensures fairness across cells, improves throughput to about 5540 Mbps, and reduces delay to about 0.367 ms. Simulation results show that when a genetic algorithm (GA) is used to achieve similar performance, its complexity is about 110 times that of the MO-DRL algorithm.
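The abstract does not spell out how TD-MASM works, so the following is only a rough sketch of the general idea its name suggests: rather than choosing all illuminated cells in one joint action, the selection is split into sub-steps that each pick a single cell. The function names, the `score_fn` callback (a stand-in for a trained network's per-cell scores), and the cell and beam counts are all hypothetical and are not taken from the paper.

```python
import math


def joint_action_count(num_cells, num_beams):
    """Number of joint actions if all beams are assigned in one decision."""
    return math.comb(num_cells, num_beams)


def select_beams_sequentially(score_fn, num_cells, num_beams):
    """Greedily pick `num_beams` distinct cells, one per sub-step.

    score_fn(chosen) is a hypothetical callback returning one score per
    cell given the cells already chosen. Each sub-step only ranks at most
    `num_cells` options, so the per-step action space stays linear in the
    number of cells instead of combinatorial.
    """
    chosen = []
    for _ in range(num_beams):
        scores = score_fn(chosen)  # re-evaluated at every sub-step
        candidates = [c for c in range(num_cells) if c not in chosen]
        best = max(candidates, key=lambda c: scores[c])
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not from the paper).
    demand = [3, 9, 1, 7, 5]  # assumed per-cell traffic demand
    picked = select_beams_sequentially(lambda chosen: demand, 5, 3)
    print("cells illuminated this slot:", picked)
    print("joint actions for 5 cells, 3 beams:", joint_action_count(5, 3))
    print("options per sub-step:", len(demand))
```

With a fixed score list the loop reduces to a top-k selection; the sub-step structure only matters when the scores depend on what has already been chosen, which is the case a learned policy would be expected to handle.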