Dynamic Channel Access via Meta-Reinforcement Learning

2021 IEEE Global Communications Conference (GLOBECOM). Pub Date: 2021-12-01. DOI: 10.1109/GLOBECOM46510.2021.9685347

Ziyang Lu, M. C. Gursoy


Citations: 2

Abstract

In this paper, we address the channel access problem in a dynamic wireless environment via meta-reinforcement learning. Spectrum is a scarce resource in wireless communications, especially given the dramatic increase in the number of devices in networks. Recently, inspired by the success of deep reinforcement learning (DRL), extensive studies have addressed wireless resource allocation problems via DRL. However, training DRL algorithms usually requires a massive amount of data collected from the environment for each specific task, and a well-trained model may fail if there is even a small variation in the environment. To address these challenges, we propose a meta-DRL framework that incorporates Model-Agnostic Meta-Learning (MAML). In the proposed framework, we train a common initialization for similar channel selection tasks. Starting from this initialization, we show that only a few gradient descent steps are required to adapt to different tasks drawn from the same distribution. We demonstrate the performance improvements via simulation results.
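The abstract gives no implementation details, so the following is only a toy sketch of the idea of learning a shared initialization that adapts in a few gradient steps. It is not the authors' meta-DRL framework. It assumes a stateless channel-selection task (a bandit with a softmax policy and an analytic expected reward) and uses the first-order approximation of MAML rather than the full second-order update. The task distribution in `sample_task`, the variable `p_free`, and all step sizes are hypothetical choices made for illustration.

```python
import math
import random


def softmax(logits):
    """Turn per-channel logits into channel-selection probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(theta, p_free):
    """Probability that the selected channel is free under softmax(theta)."""
    return sum(pi * p for pi, p in zip(softmax(theta), p_free))


def reward_gradient(theta, p_free):
    """Analytic gradient of expected_reward with respect to theta."""
    pi = softmax(theta)
    j = sum(q * p for q, p in zip(pi, p_free))
    return [q * (p - j) for q, p in zip(pi, p_free)]


def adapt(theta, p_free, inner_lr=0.5, steps=1):
    """Inner loop: a few gradient-ascent steps on one task."""
    theta = list(theta)
    for _ in range(steps):
        grad = reward_gradient(theta, p_free)
        theta = [t + inner_lr * g for t, g in zip(theta, grad)]
    return theta


def sample_task(rng, n_channels=4):
    """Hypothetical task distribution (assumes n_channels >= 2):
    one of the first two channels is mostly free, all others mostly busy."""
    p_free = [0.2] * n_channels
    p_free[rng.randrange(2)] = 0.9
    return p_free


def meta_train(n_channels=4, iterations=200, batch=8,
               inner_lr=0.5, meta_lr=0.5, seed=0):
    """Outer loop: learn an initialization shared by all sampled tasks."""
    rng = random.Random(seed)
    theta = [0.0] * n_channels
    for _ in range(iterations):
        meta_grad = [0.0] * n_channels
        for _ in range(batch):
            p_free = sample_task(rng, n_channels)
            adapted = adapt(theta, p_free, inner_lr)
            # First-order approximation: take the gradient at the adapted
            # parameters and ignore the second-order terms of full MAML.
            grad = reward_gradient(adapted, p_free)
            meta_grad = [m + g / batch for m, g in zip(meta_grad, grad)]
        theta = [t + meta_lr * m for t, m in zip(theta, meta_grad)]
    return theta


if __name__ == "__main__":
    init = meta_train()
    new_task = [0.2, 0.9, 0.2, 0.2]
    for steps in (0, 1, 3):
        adapted = adapt(init, new_task, steps=steps)
        print(steps, round(expected_reward(adapted, new_task), 3))
```

Running the script prints the expected reward on a new task after 0, 1, and 3 adaptation steps from the meta-trained initialization. A DRL version would replace the analytic gradient with policy-gradient or Q-learning updates estimated from sampled transitions.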
