{"title":"面向O-RAN/MEC联合编排的深度强化学习贝叶斯框架","authors":"Fahri Wisnu Murti;Samad Ali;Matti Latva-Aho","doi":"10.1109/OJCOMS.2024.3509777","DOIUrl":null,"url":null,"abstract":"Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) to offer low-cost deployment and bring services closer to end-users. In this paper, the joint orchestration of O-RAN and MEC using a Bayesian deep reinforcement learning (RL) framework is proposed. The framework jointly controls the O-RAN functional splits, O-RAN/MEC computing resource allocation, hosting locations, and data flow routing across geo-distributed platforms. The goal is to minimize the long-term total network operation cost and maximize MEC performance criterion while adapting to varying demands and resource availability. This orchestration problem is formulated as a Markov decision process (MDP). However, finding the exact model of the underlying O-RAN/MEC system is impractical since the system shares the same resources, serves heterogeneous demands, and its parameters have non-trivial relationships. Moreover, the formulated MDP results in a large state space with multidimensional discrete actions. To address these challenges, a model-free RL agent based on a combination of Double Deep Q-network (DDQN) with action branching is proposed. Furthermore, an efficient exploration-exploitation strategy under a Bayesian learning framework is leveraged to improve learning performance and expedite convergence. Trace-driven simulations are performed using an O-RAN-compliant model. The results show that our approach is data-efficient (i.e., converges significantly faster), increases the reward by 32% compared to its non-Bayesian version, and outperforms Deep Deterministic Policy Gradient by up to 41%.","PeriodicalId":33803,"journal":{"name":"IEEE Open Journal of the Communications Society","volume":"5 ","pages":"7685-7700"},"PeriodicalIF":6.3000,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10771978","citationCount":"0","resultStr":"{\"title\":\"A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration\",\"authors\":\"Fahri Wisnu Murti;Samad Ali;Matti Latva-Aho\",\"doi\":\"10.1109/OJCOMS.2024.3509777\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) to offer low-cost deployment and bring services closer to end-users. In this paper, the joint orchestration of O-RAN and MEC using a Bayesian deep reinforcement learning (RL) framework is proposed. The framework jointly controls the O-RAN functional splits, O-RAN/MEC computing resource allocation, hosting locations, and data flow routing across geo-distributed platforms. The goal is to minimize the long-term total network operation cost and maximize MEC performance criterion while adapting to varying demands and resource availability. This orchestration problem is formulated as a Markov decision process (MDP). However, finding the exact model of the underlying O-RAN/MEC system is impractical since the system shares the same resources, serves heterogeneous demands, and its parameters have non-trivial relationships. Moreover, the formulated MDP results in a large state space with multidimensional discrete actions. 
To address these challenges, a model-free RL agent based on a combination of Double Deep Q-network (DDQN) with action branching is proposed. Furthermore, an efficient exploration-exploitation strategy under a Bayesian learning framework is leveraged to improve learning performance and expedite convergence. Trace-driven simulations are performed using an O-RAN-compliant model. The results show that our approach is data-efficient (i.e., converges significantly faster), increases the reward by 32% compared to its non-Bayesian version, and outperforms Deep Deterministic Policy Gradient by up to 41%.\",\"PeriodicalId\":33803,\"journal\":{\"name\":\"IEEE Open Journal of the Communications Society\",\"volume\":\"5 \",\"pages\":\"7685-7700\"},\"PeriodicalIF\":6.3000,\"publicationDate\":\"2024-11-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10771978\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Open Journal of the Communications Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10771978/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Open Journal of the Communications Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10771978/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration
Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) to offer low-cost deployment and bring services closer to end-users. In this paper, the joint orchestration of O-RAN and MEC using a Bayesian deep reinforcement learning (RL) framework is proposed. The framework jointly controls the O-RAN functional splits, O-RAN/MEC computing resource allocation, hosting locations, and data flow routing across geo-distributed platforms. The goal is to minimize the long-term total network operation cost and maximize the MEC performance criterion while adapting to varying demands and resource availability. This orchestration problem is formulated as a Markov decision process (MDP). However, finding the exact model of the underlying O-RAN/MEC system is impractical since the system shares the same resources, serves heterogeneous demands, and its parameters have non-trivial relationships. Moreover, the formulated MDP results in a large state space with multidimensional discrete actions. To address these challenges, a model-free RL agent that combines a Double Deep Q-network (DDQN) with action branching is proposed. Furthermore, an efficient exploration-exploitation strategy under a Bayesian learning framework is leveraged to improve learning performance and expedite convergence. Trace-driven simulations are performed using an O-RAN-compliant model. The results show that our approach is data-efficient (i.e., it converges significantly faster), increases the reward by 32% compared to its non-Bayesian version, and outperforms Deep Deterministic Policy Gradient by up to 41%.
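The abstract describes the agent only at a high level (a DDQN whose multidimensional discrete action is handled via action branching). The minimal PyTorch sketch below is purely illustrative and is not the authors' implementation: all class and function names, layer sizes, and the four example action dimensions (e.g., functional split, compute allocation, hosting location, routing choice) are assumptions. The greedy per-branch selection marks where a Bayesian exploration strategy, such as sampling Q-estimates from a posterior, could be substituted.

```python
# Illustrative sketch only: a Q-network with action branching for a
# multidimensional discrete action space, assuming PyTorch. Names, sizes,
# and the example action dimensions are hypothetical, not from the paper.
import torch
import torch.nn as nn


class BranchingQNetwork(nn.Module):
    """Shared trunk plus one Q-value head ("branch") per action dimension,
    so the joint action space factorizes instead of exploding combinatorially."""

    def __init__(self, state_dim: int, branch_sizes: list[int], hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One head per action dimension, e.g., functional split, compute
        # allocation level, hosting location, routing choice (hypothetical).
        self.branches = nn.ModuleList(
            nn.Linear(hidden, n_actions) for n_actions in branch_sizes
        )

    def forward(self, state: torch.Tensor) -> list[torch.Tensor]:
        features = self.trunk(state)
        return [branch(features) for branch in self.branches]


def select_action(net: BranchingQNetwork, state: torch.Tensor) -> list[int]:
    """Greedy per-branch selection; a Bayesian agent would instead sample
    Q-estimates from a posterior (e.g., stochastic forward passes) to explore."""
    with torch.no_grad():
        q_per_branch = net(state.unsqueeze(0))
    return [int(q.argmax(dim=-1)) for q in q_per_branch]


if __name__ == "__main__":
    # Hypothetical sizes: a 20-dim state and 4 action dimensions.
    net = BranchingQNetwork(state_dim=20, branch_sizes=[3, 5, 4, 6])
    print(select_action(net, torch.randn(20)))  # e.g., [1, 3, 0, 5]
```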
About the journal:
The IEEE Open Journal of the Communications Society (OJ-COMS) is an open access, all-electronic journal that publishes original high-quality manuscripts on advances in the state of the art of telecommunications systems and networks. The papers in IEEE OJ-COMS are included in Scopus. Submissions reporting new theoretical findings (including novel methods, concepts, and studies) and practical contributions (including experiments and development of prototypes) are welcome. Additionally, survey and tutorial articles are considered. The IEEE OJ-COMS received its debut impact factor of 7.9 according to the Journal Citation Reports (JCR) 2023.
The IEEE Open Journal of the Communications Society covers science, technology, applications and standards for information organization, collection and transfer using electronic, optical and wireless channels and networks. Some specific areas covered include:
Systems and network architecture, control and management
Protocols, software, and middleware
Quality of service, reliability, and security
Modulation, detection, coding, and signaling
Switching and routing
Mobile and portable communications
Terminals and other end-user devices
Networks for content distribution and distributed computing
Communications-based distributed resources control.