自动驾驶车辆的联合自适应OFDM和强化学习设计：利用更新时间

IF 6.6 Q1 ENGINEERING, ELECTRICAL & ELECTRONIC IEEE Open Journal of Vehicular Technology Pub Date : 2025-01-15 DOI:10.1109/OJVT.2025.3530008

Mamady Delamou;Ahmed Naeem;Hüseyin Arslan;El Mehdi Amhoud

{"title":"自动驾驶车辆的联合自适应OFDM和强化学习设计：利用更新时间","authors":"Mamady Delamou;Ahmed Naeem;Hüseyin Arslan;El Mehdi Amhoud","doi":"10.1109/OJVT.2025.3530008","DOIUrl":null,"url":null,"abstract":"Millimeter wave (mmWave)-based orthogonal frequency-division multiplexing (OFDM) stands out as a suitable alternative for high-resolution sensing and high-speed data transmission. To meet communication and sensing requirements, many works propose a static configuration where the wave's hyperparameters such as the number of symbols in a frame and the number of frames in a communication slot are already predefined. However, two facts oblige us to redefine the problem, 1) the environment is often dynamic and uncertain, and 2) mmWave is severely impacted by wireless environments. A striking example where this challenge is very prominent is autonomous vehicle (AV). Such a system leverages integrated sensing and communication (ISAC) using mmWave to manage data transmission and the dynamism of the environment. In this work, we consider an autonomous vehicle network where an AV utilizes its queue state information (QSI) and channel state information (CSI) in conjunction with reinforcement learning techniques to manage communication and sensing. This enables the AV to achieve two primary objectives: establishing a stable communication link with other AVs and accurately estimating the velocities of surrounding objects with high resolution. The communication performance is therefore evaluated based on the queue state, the effective data rate, and the discarded packets rate. In contrast, the effectiveness of the sensing is assessed using the velocity resolution. In addition, we exploit adaptive OFDM techniques for dynamic modulation, and we suggest a reward function that leverages the age of updates to handle the communication buffer and improve sensing. The system is validated using advantage actor-critic (A2C) and proximal policy optimization (PPO). Furthermore, we compare our solution with the existing design and demonstrate its superior performance by computer simulations.","PeriodicalId":34270,"journal":{"name":"IEEE Open Journal of Vehicular Technology","volume":"6 ","pages":"455-470"},"PeriodicalIF":6.6000,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10842044","citationCount":"0","resultStr":"{\"title\":\"Joint Adaptive OFDM and Reinforcement Learning Design for Autonomous Vehicles: Leveraging Age of Updates\",\"authors\":\"Mamady Delamou;Ahmed Naeem;Hüseyin Arslan;El Mehdi Amhoud\",\"doi\":\"10.1109/OJVT.2025.3530008\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Millimeter wave (mmWave)-based orthogonal frequency-division multiplexing (OFDM) stands out as a suitable alternative for high-resolution sensing and high-speed data transmission. To meet communication and sensing requirements, many works propose a static configuration where the wave's hyperparameters such as the number of symbols in a frame and the number of frames in a communication slot are already predefined. However, two facts oblige us to redefine the problem, 1) the environment is often dynamic and uncertain, and 2) mmWave is severely impacted by wireless environments. A striking example where this challenge is very prominent is autonomous vehicle (AV). Such a system leverages integrated sensing and communication (ISAC) using mmWave to manage data transmission and the dynamism of the environment. In this work, we consider an autonomous vehicle network where an AV utilizes its queue state information (QSI) and channel state information (CSI) in conjunction with reinforcement learning techniques to manage communication and sensing. This enables the AV to achieve two primary objectives: establishing a stable communication link with other AVs and accurately estimating the velocities of surrounding objects with high resolution. The communication performance is therefore evaluated based on the queue state, the effective data rate, and the discarded packets rate. In contrast, the effectiveness of the sensing is assessed using the velocity resolution. In addition, we exploit adaptive OFDM techniques for dynamic modulation, and we suggest a reward function that leverages the age of updates to handle the communication buffer and improve sensing. The system is validated using advantage actor-critic (A2C) and proximal policy optimization (PPO). Furthermore, we compare our solution with the existing design and demonstrate its superior performance by computer simulations.\",\"PeriodicalId\":34270,\"journal\":{\"name\":\"IEEE Open Journal of Vehicular Technology\",\"volume\":\"6 \",\"pages\":\"455-470\"},\"PeriodicalIF\":6.6000,\"publicationDate\":\"2025-01-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10842044\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Open Journal of Vehicular Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10842044/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Open Journal of Vehicular Technology","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10842044/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}

引用次数: 0

摘要

基于毫米波（mmWave）的正交频分复用（OFDM）作为高分辨率传感和高速数据传输的合适替代方案脱颖而出。为了满足通信和传感需求，许多工作提出了一种静态配置，其中波的超参数（如帧中的符号数和通信槽中的帧数）已经预定义。然而，两个事实迫使我们重新定义这个问题，1)环境通常是动态的和不确定的，2)毫米波受到无线环境的严重影响。一个突出的例子是自动驾驶汽车（AV）。这种系统利用集成传感和通信（ISAC），使用毫米波来管理数据传输和环境的动态。在这项工作中，我们考虑了一个自动驾驶汽车网络，其中自动驾驶汽车利用其队列状态信息（QSI）和通道状态信息（CSI）结合强化学习技术来管理通信和感知。这使自动驾驶汽车能够实现两个主要目标：与其他自动驾驶汽车建立稳定的通信链路，并以高分辨率准确估计周围物体的速度。因此，根据队列状态、有效数据速率和丢弃包速率来评估通信性能。相比之下，使用速度分辨率来评估传感的有效性。此外，我们利用自适应OFDM技术进行动态调制，并提出了一种奖励函数，该函数利用更新的年龄来处理通信缓冲并改善感知。采用优势行为者-批评家（A2C）和近端策略优化（PPO）对系统进行了验证。此外，我们将该方案与现有设计进行了比较，并通过计算机仿真证明了其优越的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Joint Adaptive OFDM and Reinforcement Learning Design for Autonomous Vehicles: Leveraging Age of Updates

Millimeter wave (mmWave)-based orthogonal frequency-division multiplexing (OFDM) stands out as a suitable alternative for high-resolution sensing and high-speed data transmission. To meet communication and sensing requirements, many works propose a static configuration where the wave's hyperparameters such as the number of symbols in a frame and the number of frames in a communication slot are already predefined. However, two facts oblige us to redefine the problem, 1) the environment is often dynamic and uncertain, and 2) mmWave is severely impacted by wireless environments. A striking example where this challenge is very prominent is autonomous vehicle (AV). Such a system leverages integrated sensing and communication (ISAC) using mmWave to manage data transmission and the dynamism of the environment. In this work, we consider an autonomous vehicle network where an AV utilizes its queue state information (QSI) and channel state information (CSI) in conjunction with reinforcement learning techniques to manage communication and sensing. This enables the AV to achieve two primary objectives: establishing a stable communication link with other AVs and accurately estimating the velocities of surrounding objects with high resolution. The communication performance is therefore evaluated based on the queue state, the effective data rate, and the discarded packets rate. In contrast, the effectiveness of the sensing is assessed using the velocity resolution. In addition, we exploit adaptive OFDM techniques for dynamic modulation, and we suggest a reward function that leverages the age of updates to handle the communication buffer and improve sensing. The system is validated using advantage actor-critic (A2C) and proximal policy optimization (PPO). Furthermore, we compare our solution with the existing design and demonstrate its superior performance by computer simulations.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊