Dynamic Beam Hopping for DVB-S2X Satellite: A Multi-Objective Deep Reinforcement Learning Approach

2019 IEEE International Conferences on Ubiquitous Computing & Communications (IUCC) and Data Science and Computational Intelligence (DSCI) and Smart Computing, Networking and Services (SmartCNS)
Pub Date: 2019-10-01
DOI: 10.1109/IUCC/DSCI/SmartCNS.2019.00056

Yuchen Zhang, Xin Hu, Rong Chen, Zhili Zhang, Liquan Wang, Weidong Wang


Citations: 8

Abstract

Dynamic Beam Hopping (DBH) is a crucial technology for adapting to the varied service configurations of the multi-beam satellite communications market. Conventional beam hopping methods ignore the intrinsic correlation between successive decisions and therefore obtain only the solution that is optimal at the current time, whereas deep reinforcement learning (DRL) is a typical approach to sequential decision problems. To address the DBH problem in the Differentiated Services (DIFFSERV) scenario, this paper designs a multi-objective deep reinforcement learning (MO-DRL) algorithm. In addition, as the required number of beams grows, the complexity of system implementation increases significantly, so the paper proposes a time-division multi-action selection method (TD-MASM) to mitigate the curse of dimensionality. Under realistic conditions, the low-complexity MO-DRL algorithm ensures fairness among cells, raises the throughput to about 5540 Mbps, and reduces the delay to about 0.367 ms. Simulation results show that when a genetic algorithm (GA) is used to achieve similar performance, its complexity is about 110 times that of the MO-DRL algorithm.
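The abstract does not describe how TD-MASM works internally, so the following Python sketch is only a generic illustration of why choosing beams one at a time can shrink the action space; it is not the authors' method. It compares the number of joint actions (all ways to illuminate `num_beams` of `num_cells` cells at once) with a sequential scheme that scores `num_cells` candidates per sub-step. The names `joint_action_count`, `select_beams_sequentially`, `toy_score`, and `DEMAND`, as well as the adjacency penalty, are hypothetical and stand in for a learned value function.

```python
from math import comb

# Hypothetical per-cell traffic demand; these numbers are illustrative only.
DEMAND = [5, 9, 8, 1, 7, 3]


def joint_action_count(num_cells: int, num_beams: int) -> int:
    """Number of distinct ways to light num_beams of num_cells cells at once."""
    return comb(num_cells, num_beams)


def toy_score(chosen):
    """Stand-in for a learned value function (an assumption, not from the paper).

    Each cell is scored by its demand minus a penalty of 4 for every
    already-chosen neighbouring cell (index distance 1), mimicking
    co-channel interference between adjacent beams.
    """
    return [
        d - 4 * sum(1 for k in chosen if abs(c - k) == 1)
        for c, d in enumerate(DEMAND)
    ]


def select_beams_sequentially(score_fn, num_cells, num_beams):
    """Pick beams one sub-step at a time, masking cells that are already lit.

    Each sub-step evaluates at most num_cells candidates, so the policy
    output stays linear in num_cells instead of growing combinatorially.
    """
    chosen = []
    for _ in range(num_beams):
        scores = score_fn(chosen)  # re-score given the earlier picks
        candidates = [c for c in range(num_cells) if c not in chosen]
        best = max(candidates, key=lambda c: scores[c])
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    print(joint_action_count(len(DEMAND), 3))  # 20 joint actions
    print(select_beams_sequentially(toy_score, len(DEMAND), 3))  # [1, 4, 2]
```

In this toy setting the joint formulation has 20 actions, while the sequential one only ever ranks 6 cells per sub-step. The gap widens quickly as the number of cells and beams grows, which is the dimensionality problem the abstract refers to.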


