Spatial-Temporal Multimodal End-to-End Autonomous Driving

2023 8th International Conference on Computer and Communication Systems (ICCCS). Pub Date: 2023-04-21. DOI: 10.1109/ICCCS57501.2023.10151280

Lei Yang, Weimin Lei


Citations: 1

Abstract

Autonomous driving requires precise perception of the surrounding environment. Considering the complementarity of sensor data, we propose an end-to-end model for spatial-temporal multimodal fusion based on an attention mechanism. Our model fuses camera and light detection and ranging (LiDAR) data and works as follows: (i) the spatial network performs spatial feature learning, taking as input range view (RV) images derived from LiDAR and red, green, and blue (RGB) camera images, extracting features with parallel ResNet18 networks, and fusing them through an attention mechanism; (ii) the temporal network learns the temporal dimension of the spatial features, applying attention over the current spatial features from the spatial network and the historical spatial features to enhance the features relevant to the autonomous driving task; (iii) finally, the model uses the spatial-temporal features to regress waypoints, with the prediction branch selected by the navigation instruction. Our model was trained and tested in the CARLA simulator, and experiments showed that it can complete autonomous driving tasks in complex environments, achieving a success rate of 85%, especially in scenes with many dynamic objects.
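The three stages in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: short hand-written vectors stand in for the ResNet18 features, plain scaled dot-product attention stands in for the unspecified attention mechanism, and the function names, the mean-feature query, the linear branch heads, and the command names ("follow", "left") are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    """Scaled dot-product attention for one query over key/value vectors."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def spatial_fusion(rgb_feat, rv_feat):
    """Stage (i): fuse the two per-modality feature vectors with attention."""
    # Assumption: the query is the mean of both modality features.
    query = [(a + b) / 2 for a, b in zip(rgb_feat, rv_feat)]
    return attend(query, [rgb_feat, rv_feat], [rgb_feat, rv_feat])

def temporal_fusion(current, history):
    """Stage (ii): the current feature attends over history plus itself."""
    frames = history + [current]
    return attend(current, frames, frames)

def predict_waypoints(feature, command, branches):
    """Stage (iii): pick the branch for the command, then regress (x, y) pairs."""
    # Assumption: each branch is a linear head, one (x_row, y_row) per waypoint.
    return [(dot(wx, feature), dot(wy, feature)) for wx, wy in branches[command]]

# Toy inputs standing in for ResNet18 outputs (hypothetical values).
rgb = [1.0, 0.0, 0.5, 0.2]
rv = [0.0, 1.0, 0.5, 0.1]
history = [[0.4, 0.6, 0.5, 0.1], [0.5, 0.5, 0.5, 0.2]]
branches = {
    "follow": [([1, 0, 0, 0], [0, 1, 0, 0]), ([0, 0, 1, 0], [0, 0, 0, 1])],
    "left": [([-1, 0, 0, 0], [0, 1, 0, 0]), ([-1, 0, 1, 0], [0, 1, 0, 1])],
}

spatial = spatial_fusion(rgb, rv)
spatial_temporal = temporal_fusion(spatial, history)
waypoints = predict_waypoints(spatial_temporal, "follow", branches)
print(waypoints)
```

Because the attention weights are non-negative and sum to one, each fused feature is a convex combination of its inputs; a trained model would instead learn the query, key, and value projections and the branch heads.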



