Spatial-Temporal Multimodal End-to-End Autonomous Driving

Lei Yang, Weimin Lei
Published in: 2023 8th International Conference on Computer and Communication Systems (ICCCS), 2023-04-21
DOI: 10.1109/ICCCS57501.2023.10151280 (https://doi.org/10.1109/ICCCS57501.2023.10151280)
Citations: 1

Abstract

Autonomous driving requires precise perception of the surrounding environment. Exploiting the complementarity of sensor data, we propose an end-to-end spatial-temporal multimodal fusion model based on an attention mechanism. The model fuses camera and light detection and ranging (LiDAR) data and works as follows: (i) the spatial network learns spatial features from two inputs, the range view (RV) representation of the LiDAR data and red, green, and blue (RGB) images, which are passed through parallel ResNet18 networks for feature extraction and fused through an attention mechanism; (ii) the temporal network learns the temporal dimension of the spatial features, applying attention over the current spatial features from the spatial network and the historical spatial features to enhance the features relevant to the autonomous driving task; (iii) finally, the model uses the spatial-temporal features to regress waypoints, with the navigation instruction selecting which prediction branch is used. Our model was trained and tested in the CARLA simulator, and experiments showed that it can complete autonomous driving tasks in complex environments, achieving a success rate of 85%, particularly in scenarios with many dynamic objects.
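The three stages described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the feature vectors stand in for ResNet18 outputs, the attention is plain scaled dot-product attention with the tokens reused as values, each prediction branch is a bare linear head, and the function names, command names, and the averaging step in `spatial_fusion` are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query: Vector, tokens: List[Vector]) -> Vector:
    """Scaled dot-product attention; the tokens act as both keys and values."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, t) / scale for t in tokens])
    return [sum(w * t[i] for w, t in zip(weights, tokens))
            for i in range(len(query))]


def spatial_fusion(rgb_feat: Vector, rv_feat: Vector) -> Vector:
    """Stage (i): fuse camera and LiDAR range-view features.

    Assumption: each modality attends over both modalities and the two
    attended vectors are averaged. The paper's fusion may differ.
    """
    tokens = [rgb_feat, rv_feat]
    from_rgb = attend(rgb_feat, tokens)
    from_rv = attend(rv_feat, tokens)
    return [(a + b) / 2 for a, b in zip(from_rgb, from_rv)]


def temporal_fusion(current: Vector, history: List[Vector]) -> Vector:
    """Stage (ii): the current spatial feature attends over past features and itself."""
    return attend(current, history + [current])


def predict_waypoints(feat: Vector, command: str,
                      branches: Dict[str, List[Vector]]) -> List[Tuple[float, float]]:
    """Stage (iii): the navigation command selects a branch, which regresses waypoints.

    Each branch is a hypothetical linear head with two weight rows per (x, y) waypoint.
    """
    rows = branches[command]
    flat = [dot(row, feat) for row in rows]
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


# Toy inputs standing in for ResNet18 features (values are arbitrary).
rgb_feat = [0.2, 0.8, 0.1, 0.5]
rv_feat = [0.6, 0.1, 0.9, 0.3]
history = [[0.3, 0.4, 0.5, 0.4], [0.4, 0.5, 0.4, 0.4]]

# Hypothetical command names; each branch here predicts a single waypoint.
branches = {
    "follow_lane": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    "turn_left": [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
}

spatial = spatial_fusion(rgb_feat, rv_feat)
spatial_temporal = temporal_fusion(spatial, history)
waypoints = predict_waypoints(spatial_temporal, "turn_left", branches)
print(waypoints)
```

The point of the branch dictionary is that the shared spatial-temporal feature is computed once, while the navigation instruction only decides which regression head reads it. A real model would replace the toy vectors with learned feature maps and the linear heads with trained layers.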