Reinforcement Learning Scheduler for Vehicle-to-Vehicle Communications Outside Coverage

T. Şahin, R. Khalili, Mate Boban, A. Wolisz
{"title":"Reinforcement Learning Scheduler for Vehicle-to-Vehicle Communications Outside Coverage","authors":"T. Şahin, R. Khalili, Mate Boban, A. Wolisz","doi":"10.1109/VNC.2018.8628366","DOIUrl":null,"url":null,"abstract":"Radio resources in vehicle-to-vehicle (V2V) communication can be scheduled either by a centralized scheduler residing in the network (e.g., a base station in case of cellular systems) or a distributed scheduler, where the resources are autonomously selected by the vehicles. The former approach yields a considerably higher resource utilization in case the network coverage is uninterrupted. However, in case of intermittent or-of-coverage, due to not having input from centralized scheduler, vehicles need to revert to distributed scheduling.Motivated by recent advances in reinforcement learning (RL), we investigate whether a centralized learning scheduler can be taught to efficiently pre-assign the resources to vehicles for-of-coverage V2V communication. Specifically, we use the actor-critic RL algorithm to train the centralized scheduler to provide non-interfering resources to vehicles before they enter the-of-coverage area.Our initial results show that a RL-based scheduler can achieve performance as good as or better than the state-of-art distributed scheduler, often outperforming it. Furthermore, the learning process completes within a reasonable time (ranging from a few hundred to a few thousand epochs), thus making the RL-based scheduler a promising solution for V2V communications with intermittent network coverage.","PeriodicalId":335017,"journal":{"name":"2018 IEEE Vehicular Networking Conference (VNC)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE Vehicular Networking Conference (VNC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VNC.2018.8628366","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 18

Abstract

Radio resources in vehicle-to-vehicle (V2V) communication can be scheduled either by a centralized scheduler residing in the network (e.g., a base station in the case of cellular systems) or by a distributed scheduler, where the resources are autonomously selected by the vehicles. The former approach yields considerably higher resource utilization as long as network coverage is uninterrupted. However, in the case of intermittent or out-of-coverage operation, vehicles lack input from the centralized scheduler and need to revert to distributed scheduling. Motivated by recent advances in reinforcement learning (RL), we investigate whether a centralized learning scheduler can be taught to efficiently pre-assign resources to vehicles for out-of-coverage V2V communication. Specifically, we use the actor-critic RL algorithm to train the centralized scheduler to provide non-interfering resources to vehicles before they enter the out-of-coverage area. Our initial results show that an RL-based scheduler can perform as well as or better than the state-of-the-art distributed scheduler, often outperforming it. Furthermore, the learning process completes within a reasonable time (ranging from a few hundred to a few thousand epochs), making the RL-based scheduler a promising solution for V2V communications with intermittent network coverage.
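The abstract only outlines the approach; the paper's actual state, action, and reward design is not reproduced here. Purely as an illustrative sketch of an actor-critic loop that pre-assigns resource blocks and rewards non-interfering choices, consider the toy example below. All names and quantities (ActorCritic, toy_step, N_VEHICLES, N_RESOURCES, the random feature vector, and the neighbour-collision reward) are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative actor-critic sketch for centralized resource pre-assignment (not the paper's code).
# Assumed toy setup: N_VEHICLES vehicles, N_RESOURCES resource blocks; the scheduler assigns a
# block to each vehicle in turn and earns +1 when the block does not collide with a neighbour's.
import torch
import torch.nn as nn

N_VEHICLES, N_RESOURCES, FEATURES = 8, 4, 16

class ActorCritic(nn.Module):
    def __init__(self, n_inputs, n_actions):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(n_inputs, 64), nn.ReLU())
        self.actor = nn.Linear(64, n_actions)   # logits over resource blocks
        self.critic = nn.Linear(64, 1)          # state-value estimate

    def forward(self, x):
        h = self.shared(x)
        return self.actor(h), self.critic(h)

def toy_step(vehicle, action, assignments):
    """Hypothetical reward: +1 if no adjacent vehicle already uses the same block, else -1."""
    neighbours = [vehicle - 1, vehicle + 1]
    collision = any(0 <= n < N_VEHICLES and assignments[n] == action for n in neighbours)
    assignments[vehicle] = action
    return 1.0 if not collision else -1.0

model = ActorCritic(FEATURES, N_RESOURCES)
optim = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(500):                          # on the order of "a few hundred ... epochs"
    state = torch.randn(FEATURES)                 # stand-in for vehicle/channel features
    assignments = [-1] * N_VEHICLES
    log_probs, values, rewards = [], [], []
    for v in range(N_VEHICLES):                   # pre-assign a block to each vehicle
        logits, value = model(state)              # a real scheduler would use per-vehicle state
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        rewards.append(toy_step(v, action.item(), assignments))
        log_probs.append(dist.log_prob(action))
        values.append(value.squeeze())
    returns = torch.tensor(rewards)               # undiscounted, per-assignment rewards
    values = torch.stack(values)
    advantage = returns - values.detach()         # actor-critic: advantage-weighted policy gradient
    actor_loss = -(torch.stack(log_probs) * advantage).mean()
    critic_loss = nn.functional.mse_loss(values, returns)
    loss = actor_loss + 0.5 * critic_loss
    optim.zero_grad(); loss.backward(); optim.step()
```

In the paper the reward would presumably be tied to V2V packet reception and interference among vehicles in the out-of-coverage area rather than this simple neighbour-collision rule; the sketch only shows the general actor-critic training structure.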