Title: Deep reinforcement learning for autonomous SideLink radio resource management in platoon-based C-V2X networks: An overview
Authors: Nessrine Trabelsi, Lamia Chaari Fourati, Wael Jaafar
DOI: 10.1016/j.comnet.2024.110901
Journal: Computer Networks, vol. 255, Article 110901 (Q1, Computer Science, Hardware & Architecture; IF 4.4)
Published: 2024-11-12
URL: https://www.sciencedirect.com/science/article/pii/S1389128624007333
Citations: 0
Abstract
Dynamic and autonomous SideLink (SL) Radio Resource Management (RRM) is essential for platoon-based cellular vehicular networks. However, this task is challenging due to several factors: the limited spectrum below 6 GHz, stringent vehicle-to-everything (V2X) communication requirements, uncertain and dynamic environments, limited vehicle sensing capabilities, and inherently distributed operation. These limitations often lead to resource collisions, data packet loss, and increased latency. Current standardized approaches in Long-Term Evolution-V2X (LTE-V2X) and New Radio-V2X (NR-V2X) rely on random resource selection, limiting their efficiency. Moreover, RRM is inherently a complex combinatorial optimization problem that may involve conflicting objectives and constraints, making traditional approaches inadequate. Platoon-based communication necessitates careful resource allocation to support a diverse mix of traffic types: safety-critical control messaging within platoons, less time-sensitive traffic management information between platoons, and even infotainment services such as media streaming. Optimizing inter- and intra-platoon resource sharing is crucial to avoid excessive interference and ensure overall network performance. Deep Reinforcement Learning (DRL), which combines Deep Learning (DL) and Reinforcement Learning (RL), has recently been investigated for network resource management and offers a potential solution to these challenges. A DRL agent, represented by deep neural networks, interacts with the environment and learns optimal decision-making through trial and error. This paper provides an overview of DRL-based methods proposed for autonomous SL RRM in single- and multi-agent platoon-based C-V2X networks. It considers both intra- and inter-platoon communications with their specific requirements. We discuss the components of the Markov Decision Processes (MDPs) used to model the sequential decision-making of RRM. We then detail the DRL algorithms, training paradigms, and insights into the achieved results. Finally, we highlight challenges in existing works and suggest strategies for addressing them.
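To make the MDP framing concrete, the toy sketch below (not taken from the paper) casts distributed SL resource selection as a single-state MDP: two platoon agents repeatedly pick one of N sub-channels, a collision (both choosing the same sub-channel) yields a negative reward, and each agent runs epsilon-greedy tabular Q-learning. The environment, reward values, and hyperparameters are all illustrative assumptions; the surveyed works use deep networks and far richer state spaces.

```python
import random

# Illustrative toy (assumption, not the paper's method): each agent learns,
# by trial and error, to pick a sidelink sub-channel that avoids collisions.
N_CHANNELS = 4       # hypothetical number of SL sub-channels
ALPHA, EPSILON = 0.1, 0.1  # learning rate and exploration rate (assumed)

class Agent:
    def __init__(self):
        # Single-state MDP: one Q-value per sub-channel (action).
        self.q = [0.0] * N_CHANNELS

    def act(self):
        if random.random() < EPSILON:               # explore
            return random.randrange(N_CHANNELS)
        return max(range(N_CHANNELS), key=self.q.__getitem__)  # exploit

    def learn(self, action, reward):
        # Tabular Q-learning update (no next state in this bandit-like MDP).
        self.q[action] += ALPHA * (reward - self.q[action])

random.seed(0)
a, b = Agent(), Agent()
for _ in range(2000):
    ca, cb = a.act(), b.act()
    # Collision -> -1 for both; collision-free transmission -> +1.
    r = -1.0 if ca == cb else 1.0
    a.learn(ca, r)
    b.learn(cb, r)

# After training, the agents' greedy picks typically settle on distinct
# sub-channels, i.e., they implicitly coordinate without message exchange.
print(a.q.index(max(a.q)), b.q.index(max(b.q)))
```

The same structure (state, action, reward) is what the surveyed DRL methods enrich: states become sensing and channel observations, actions cover sub-channel and power choices, and the Q-table is replaced by a deep network.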
About the journal:
Computer Networks is an international, archival journal providing complete coverage of all topics of interest to those involved in computer communications networking. The audience includes researchers, managers, and operators of networks, as well as designers and implementers. The Editorial Board will consider any material for publication that is of interest to these groups.