A cross-platform deep reinforcement learning model for autonomous navigation without global information in different scenes

Control Engineering Practice | IF 5.4 | Zone 2 (Computer Science) | JCR Q1 (Automation & Control Systems) | Pub Date: 2024-06-13 | DOI: 10.1016/j.conengprac.2024.105991

Chuanxin Cheng, Hao Zhang, Yuan Sun, Hongfeng Tao, Yiyang Chen

Citations: 0

Abstract

This paper applies a deep reinforcement learning algorithm, the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm, to autonomous navigation in intelligent transportation systems. A fully connected neural network is trained in a simulation environment; it outputs the desired linear and angular velocities of the vehicle from real-time data measured by embedded sensors. Over successive training epochs, the model learns to navigate the vehicle to a given destination by making rational motion decisions at each discrete time instant, without knowledge of the global environment. In particular, to improve the model's generalization across scenes, an input preprocessing function is proposed to eliminate the singularity and uniformity of the raw input data. In a large number of simulation tests, the vehicle travels from the start position to the destination without collision within the specified time limit in more than 90% of trials. The remaining failures occur mainly because, for safety, the vehicle cannot approach a destination that lies immediately adjacent to an obstacle. Furthermore, traditional mapless navigation algorithms become trapped in locally optimal solutions when facing U-shaped obstacles. This paper introduces a virtual obstacle mechanism that prevents the vehicle from entering the U-shaped region, which addresses this issue. Finally, the model trained in simulation can be loaded directly onto a physical vehicle regardless of the differing processor architectures. Extensive experiments show that the model improves the autonomous navigation capability of vehicles when the system cannot obtain global environment information, thereby optimizing the functions of the navigation module in intelligent transportation systems.
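The abstract names the algorithm but gives no implementation details. As a rough illustration only, the following sketch shows the two ingredients of the standard TD3 critic target: target policy smoothing and the clipped double-Q value. It is not the authors' code. The hyperparameters (`sigma`, `noise_clip`, `gamma`), the normalized action range `[-1, 1]`, and the reading of the two action components as linear and angular velocity are assumptions made for this example. The third TD3 ingredient, delayed actor updates, is omitted.

```python
import random


def clip(x, low, high):
    """Clamp x to the closed interval [low, high]."""
    return max(low, min(high, x))


def smoothed_target_action(mu_next, rng, sigma=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0):
    """Target policy smoothing: perturb the target actor's action with
    clipped Gaussian noise, then clip to the (assumed) action bounds."""
    smoothed = []
    for a in mu_next:
        noise = clip(rng.gauss(0.0, sigma), -noise_clip, noise_clip)
        smoothed.append(clip(a + noise, low, high))
    return smoothed


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two
    target-critic estimates; no bootstrapping on terminal steps."""
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


# Hypothetical values for one transition.
rng = random.Random(0)
mu_next = [0.9, -0.2]  # assumed [linear velocity, angular velocity], normalized
a_next = smoothed_target_action(mu_next, rng)

# q1_next and q2_next stand in for the two target critics evaluated at a_next.
y = td3_target(reward=1.0, done=0.0, q1_next=2.0, q2_next=1.5)
y_terminal = td3_target(reward=1.0, done=1.0, q1_next=2.0, q2_next=1.5)
```

Taking the minimum of the two critic estimates counteracts the overestimation bias of a single critic. On a terminal step, such as a collision or arrival at the destination, the target reduces to the immediate reward.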

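Source journal

Control Engineering Practice (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 9.20

Self-citation rate: 12.20%

Articles published: 183

Review time: 44 days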

About the journal: Control Engineering Practice strives to meet the needs of industrial practitioners and industrially related academics and researchers. It publishes papers that illustrate the direct application of control theory and its supporting tools in all possible areas of automation. As a result, the journal contains only papers that can be considered to have made significant contributions to the application of advanced control techniques. Practical results are normally expected; where only simulation studies are available, the paper must demonstrate that the simulation model is representative of a genuine application. Strictly theoretical papers will find a more appropriate home in Control Engineering Practice's sister publication, Automatica. Papers are also expected to be innovative with respect to the state of the art and sufficiently detailed for a reader to duplicate the main results (supplementary material, including datasets, tables, code and any relevant interactive material, can be made available and downloaded from the website). The benefits of the presented methods must be made very clear, and the new techniques must be compared and contrasted with results obtained using existing methods. A thorough analysis of failures that may occur in the design process and implementation can also be part of the paper. The scope of Control Engineering Practice matches the activities of IFAC. Papers that demonstrate the contribution of automation and control to improving the performance, quality, productivity, sustainability, resource and energy efficiency, and manageability of systems and processes for the benefit of mankind, and that are relevant to industrial practitioners, are most welcome.