Distributed Reinforcement Learning for NOMA-Enabled Mobile Edge Computing

Zhong Yang, Yuanwei Liu, Yue Chen
{"title":"Distributed Reinforcement Learning for NOMA-Enabled Mobile Edge Computing","authors":"Zhong Yang, Yuanwei Liu, Yue Chen","doi":"10.1109/ICCWorkshops49005.2020.9145457","DOIUrl":null,"url":null,"abstract":"A novel non-orthogonal multiple access (NOMA) enabled cache-aided mobile edge computing (MEC) framework is proposed, for minimizing the sum energy consumption. The NOMA strategy enables mobile users to offload computation tasks to the access point (AP) simultaneously, which improves the spectrum efficiency. In this article, the considered resource allocation problem is formulated as a long-term reward maximization problem that involves a joint optimization of task offloading decision, computation resource allocation, and caching decision. To tackle this nontrivial problem, a single-agent Q-learning (SAQ-learning) algorithm is invoked to learn a long-term resource allocation strategy from historical experience. Moreover, a Bayesian learning automata (BLA) based multi-agent Q-learning (MAQ-learning) algorithm is proposed for task offloading decisions. More specifically, a BLA based action select scheme is proposed for the agents in MAQ-learning to select the optimal actions in every state. The proposed BLA based action selection scheme is instantaneously self-correcting, consequently, if the probabilities of two computing models (i.e., local computing and offloading computing) are not equal, the optimal action unveils eventually. Extensive simulations demonstrate that: 1) The proposed cache-aided NOMA MEC framework significantly outperforms the other representative benchmark schemes under various network setups. 2) The effectiveness of the proposed BAL-MAQ-learning algorithm is confirmed from the comparison with the results of conventional reinforcement learning algorithms.","PeriodicalId":254869,"journal":{"name":"2020 IEEE International Conference on Communications Workshops (ICC Workshops)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Conference on Communications Workshops (ICC Workshops)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCWorkshops49005.2020.9145457","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

A novel non-orthogonal multiple access (NOMA) enabled cache-aided mobile edge computing (MEC) framework is proposed for minimizing the sum energy consumption. The NOMA strategy enables mobile users to offload computation tasks to the access point (AP) simultaneously, which improves the spectrum efficiency. In this article, the considered resource allocation problem is formulated as a long-term reward maximization problem that involves the joint optimization of the task offloading decision, computation resource allocation, and caching decision. To tackle this nontrivial problem, a single-agent Q-learning (SAQ-learning) algorithm is invoked to learn a long-term resource allocation strategy from historical experience. Moreover, a Bayesian learning automata (BLA) based multi-agent Q-learning (MAQ-learning) algorithm is proposed for the task offloading decisions. More specifically, a BLA-based action selection scheme is proposed that enables the agents in MAQ-learning to select the optimal action in every state. The proposed BLA-based action selection scheme is instantaneously self-correcting; consequently, whenever the probabilities associated with the two computing modes (i.e., local computing and offloading computing) are unequal, the optimal action is eventually revealed. Extensive simulations demonstrate that: 1) the proposed cache-aided NOMA MEC framework significantly outperforms other representative benchmark schemes under various network setups; and 2) the effectiveness of the proposed BLA-MAQ-learning algorithm is confirmed by comparison with conventional reinforcement learning algorithms.
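The BLA-based action selection described in the abstract is, in essence, Thompson sampling over the two computing modes: each agent keeps a Beta posterior per mode (local vs. offloading), draws one sample from each posterior, plays the mode with the larger draw, and updates that mode's posterior with binary success feedback, alongside a standard tabular Q-learning update. The sketch below is a minimal illustration of that idea, not the authors' implementation; the names (`BLAAgent`, `select_action`, `update`, `q_update`), the Bernoulli reward model, and the toy single-state environment are all assumptions made for illustration.

```python
import random
from collections import defaultdict

class BLAAgent:
    """Bayesian learning automaton over the two computing modes:
    action 0 = local computing, action 1 = offloading computing.
    Each action keeps a Beta(alpha, beta) posterior over its chance
    of producing a favorable (energy-saving) outcome."""

    def __init__(self):
        self.alpha = [1.0, 1.0]  # Beta(1, 1): uniform prior for both actions
        self.beta = [1.0, 1.0]

    def select_action(self):
        # Thompson-style draw: sample each posterior, act greedily on the draws.
        draws = [random.betavariate(self.alpha[a], self.beta[a]) for a in (0, 1)]
        return 0 if draws[0] >= draws[1] else 1

    def update(self, action, success):
        # Binary feedback: success = True if the chosen mode lowered energy use.
        if success:
            self.alpha[action] += 1.0
        else:
            self.beta[action] += 1.0

def q_update(q, state, action, reward, next_state, lr=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) += lr * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in (0, 1))
    q[(state, action)] += lr * (reward + gamma * best_next - q[(state, action)])

if __name__ == "__main__":
    # Toy stand-in for the MEC environment (hypothetical): offloading
    # yields a favorable outcome more often than local computing.
    success_prob = {0: 0.4, 1: 0.7}
    agent = BLAAgent()
    q = defaultdict(float)
    state = 0  # single dummy state for illustration
    for _ in range(2000):
        action = agent.select_action()
        success = random.random() < success_prob[action]
        agent.update(action, success)
        q_update(q, state, action, 1.0 if success else 0.0, state)
    print("posterior means:",
          [agent.alpha[a] / (agent.alpha[a] + agent.beta[a]) for a in (0, 1)])
```

Because each Beta posterior concentrates around the true success probability of its mode, the sampled draws favor the better mode increasingly often, which mirrors the self-correcting behavior the abstract attributes to the BLA scheme when the two modes' probabilities are unequal.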