A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration

IEEE Open Journal of the Communications Society · IF 6.3 · Q1 (Engineering, Electrical & Electronic) · Pub Date: 2024-11-29 · DOI: 10.1109/OJCOMS.2024.3509777
Fahri Wisnu Murti;Samad Ali;Matti Latva-Aho
{"title":"面向O-RAN/MEC联合编排的深度强化学习贝叶斯框架","authors":"Fahri Wisnu Murti;Samad Ali;Matti Latva-Aho","doi":"10.1109/OJCOMS.2024.3509777","DOIUrl":null,"url":null,"abstract":"Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) to offer low-cost deployment and bring services closer to end-users. In this paper, the joint orchestration of O-RAN and MEC using a Bayesian deep reinforcement learning (RL) framework is proposed. The framework jointly controls the O-RAN functional splits, O-RAN/MEC computing resource allocation, hosting locations, and data flow routing across geo-distributed platforms. The goal is to minimize the long-term total network operation cost and maximize MEC performance criterion while adapting to varying demands and resource availability. This orchestration problem is formulated as a Markov decision process (MDP). However, finding the exact model of the underlying O-RAN/MEC system is impractical since the system shares the same resources, serves heterogeneous demands, and its parameters have non-trivial relationships. Moreover, the formulated MDP results in a large state space with multidimensional discrete actions. To address these challenges, a model-free RL agent based on a combination of Double Deep Q-network (DDQN) with action branching is proposed. Furthermore, an efficient exploration-exploitation strategy under a Bayesian learning framework is leveraged to improve learning performance and expedite convergence. Trace-driven simulations are performed using an O-RAN-compliant model. The results show that our approach is data-efficient (i.e., converges significantly faster), increases the reward by 32% compared to its non-Bayesian version, and outperforms Deep Deterministic Policy Gradient by up to 41%.","PeriodicalId":33803,"journal":{"name":"IEEE Open Journal of the Communications Society","volume":"5 ","pages":"7685-7700"},"PeriodicalIF":6.3000,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10771978","citationCount":"0","resultStr":"{\"title\":\"A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration\",\"authors\":\"Fahri Wisnu Murti;Samad Ali;Matti Latva-Aho\",\"doi\":\"10.1109/OJCOMS.2024.3509777\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) to offer low-cost deployment and bring services closer to end-users. In this paper, the joint orchestration of O-RAN and MEC using a Bayesian deep reinforcement learning (RL) framework is proposed. The framework jointly controls the O-RAN functional splits, O-RAN/MEC computing resource allocation, hosting locations, and data flow routing across geo-distributed platforms. The goal is to minimize the long-term total network operation cost and maximize MEC performance criterion while adapting to varying demands and resource availability. This orchestration problem is formulated as a Markov decision process (MDP). However, finding the exact model of the underlying O-RAN/MEC system is impractical since the system shares the same resources, serves heterogeneous demands, and its parameters have non-trivial relationships. Moreover, the formulated MDP results in a large state space with multidimensional discrete actions. 
To address these challenges, a model-free RL agent based on a combination of Double Deep Q-network (DDQN) with action branching is proposed. Furthermore, an efficient exploration-exploitation strategy under a Bayesian learning framework is leveraged to improve learning performance and expedite convergence. Trace-driven simulations are performed using an O-RAN-compliant model. The results show that our approach is data-efficient (i.e., converges significantly faster), increases the reward by 32% compared to its non-Bayesian version, and outperforms Deep Deterministic Policy Gradient by up to 41%.\",\"PeriodicalId\":33803,\"journal\":{\"name\":\"IEEE Open Journal of the Communications Society\",\"volume\":\"5 \",\"pages\":\"7685-7700\"},\"PeriodicalIF\":6.3000,\"publicationDate\":\"2024-11-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10771978\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Open Journal of the Communications Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10771978/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Open Journal of the Communications Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10771978/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) to offer low-cost deployment and bring services closer to end-users. In this paper, the joint orchestration of O-RAN and MEC using a Bayesian deep reinforcement learning (RL) framework is proposed. The framework jointly controls the O-RAN functional splits, O-RAN/MEC computing resource allocation, hosting locations, and data flow routing across geo-distributed platforms. The goal is to minimize the long-term total network operation cost and maximize MEC performance criterion while adapting to varying demands and resource availability. This orchestration problem is formulated as a Markov decision process (MDP). However, finding the exact model of the underlying O-RAN/MEC system is impractical since the system shares the same resources, serves heterogeneous demands, and its parameters have non-trivial relationships. Moreover, the formulated MDP results in a large state space with multidimensional discrete actions. To address these challenges, a model-free RL agent based on a combination of Double Deep Q-network (DDQN) with action branching is proposed. Furthermore, an efficient exploration-exploitation strategy under a Bayesian learning framework is leveraged to improve learning performance and expedite convergence. Trace-driven simulations are performed using an O-RAN-compliant model. The results show that our approach is data-efficient (i.e., converges significantly faster), increases the reward by 32% compared to its non-Bayesian version, and outperforms Deep Deterministic Policy Gradient by up to 41%.
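Read literally, the stated objective fits a discounted MDP of the following form, where the trade-off weight $\beta$ between the MEC performance criterion and the operation cost is an illustrative assumption rather than the paper's own notation:

$$\max_{\pi}\; \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty}\gamma^{t}\big(\beta\,P^{\mathrm{MEC}}_{t}-C^{\mathrm{net}}_{t}\big)\Big],$$

with $C^{\mathrm{net}}_{t}$ the total network operation cost at decision epoch $t$, $P^{\mathrm{MEC}}_{t}$ the MEC performance criterion, and $\gamma\in[0,1)$ the discount factor.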
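The abstract's key scaling device is action branching: rather than a single Q-head over the combinatorial joint action space (functional split × resource allocation × hosting location × routing), the agent keeps one output branch per decision dimension, so head size grows linearly in the number of decisions instead of exponentially in their product. The following is a minimal PyTorch sketch of such a branching dueling head; it illustrates the general technique (cf. branching dueling Q-networks), not the authors' implementation, and all dimension sizes are invented placeholders.

```python
# Minimal sketch of an action-branching Q-network (illustrative only).
# Each orchestration decision -- split, compute allocation, hosting
# location, routing -- gets its own advantage branch over a shared trunk.
import torch
import torch.nn as nn

class BranchingQNetwork(nn.Module):
    def __init__(self, state_dim, branch_sizes, hidden=256):
        super().__init__()
        # Shared trunk over the network state (demands, resource availability, ...)
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One state-value head shared by all branches (dueling decomposition)
        self.value = nn.Linear(hidden, 1)
        # One advantage head per discrete action dimension
        self.branches = nn.ModuleList(nn.Linear(hidden, n) for n in branch_sizes)

    def forward(self, state):
        h = self.trunk(state)
        v = self.value(h)
        # Q_d(s, a_d) = V(s) + A_d(s, a_d) - mean_a A_d(s, a), per branch d
        return [v + a - a.mean(dim=-1, keepdim=True)
                for a in (branch(h) for branch in self.branches)]

# Placeholder dimensions (not from the paper): e.g., 4 candidate splits,
# 8 compute-allocation levels, 5 hosting locations, 6 routing paths.
net = BranchingQNetwork(state_dim=32, branch_sizes=[4, 8, 5, 6])
q_per_branch = net(torch.randn(1, 32))
action = [q.argmax(dim=-1).item() for q in q_per_branch]  # one choice per branch
```

In a full double-DQN update, each branch's target would select the next action with the online network and evaluate it with the target network, which is what keeps the scheme "double" despite the per-branch decomposition.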
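For the Bayesian exploration-exploitation part, the abstract does not spell out the mechanism; one standard way to obtain posterior-driven exploration in deep Q-learning is Thompson-style sampling with Monte-Carlo dropout, sketched below purely as an assumption about how such a strategy could look. Instead of epsilon-greedy dithering, each action is greedy with respect to one sampled Q-function, so exploration concentrates where the posterior is uncertain.

```python
# Hedged sketch of Thompson-sampling-style exploration via MC dropout;
# the paper's exact Bayesian scheme may differ. Each stochastic forward
# pass acts as one approximate posterior sample of the Q-function.
import torch
import torch.nn as nn

class DropoutQHead(nn.Module):
    def __init__(self, in_dim, n_actions, p=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Dropout(p),                 # kept active at action-selection time
            nn.Linear(128, n_actions),
        )

    def sample_action(self, state):
        self.train()                       # keep dropout stochastic
        with torch.no_grad():
            q = self.net(state)            # one sampled Q-function
        return int(q.argmax(dim=-1))       # act greedily w.r.t. the sample

head = DropoutQHead(in_dim=32, n_actions=4)   # placeholder dimensions
a = head.sample_action(torch.randn(1, 32))
```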
Source Journal
CiteScore: 13.70
Self-citation rate: 3.80%
Articles published: 94
Review time: 10 weeks
Journal Introduction
The IEEE Open Journal of the Communications Society (OJ-COMS) is an open access, all-electronic journal that publishes original high-quality manuscripts on advances in the state of the art of telecommunications systems and networks. The papers in IEEE OJ-COMS are included in Scopus. Submissions reporting new theoretical findings (including novel methods, concepts, and studies) and practical contributions (including experiments and development of prototypes) are welcome. Additionally, survey and tutorial articles are considered. IEEE OJ-COMS received its debut impact factor of 7.9 according to the Journal Citation Reports (JCR) 2023. The journal covers science, technology, applications, and standards for information organization, collection, and transfer using electronic, optical, and wireless channels and networks. Specific areas covered include:
- Systems and network architecture, control, and management
- Protocols, software, and middleware
- Quality of service, reliability, and security
- Modulation, detection, coding, and signaling
- Switching and routing
- Mobile and portable communications
- Terminals and other end-user devices
- Networks for content distribution and distributed computing
- Communications-based distributed resources control
Latest Articles from This Journal
- Table of Contents
- Front Cover
- Hierarchical Blockchain Radio Access Networks: Architecture, Modelling, and Performance Assessment
- Active RIS-NOMA Uplink in URLLC, Jamming Mitigation via Surrogate and Deep Learning
- Transceiver Design of a Secure Multiuser FDSS-Based DFT-Spread OFDM System for RIS- and UAV-Assisted THz Communications