An End-To-End Visual Odometry based on Self-Attention Mechanism
Rongchuan Cao, Yinan Wang, Kun Yan, Bo Chen, Tianqi Ding, Tianqi Zhang
2022 IEEE 4th International Conference on Power, Intelligent Computing and Systems (ICPICS), published 2022-07-29
DOI: https://doi.org/10.1109/ICPICS55264.2022.9873538
Citations: 1
Abstract
To address the difficulty existing methods have in capturing and expressing key features, we design an end-to-end visual odometry algorithm based on a self-attention mechanism. The algorithm consists of two parts: a Visual Transformer network and a Bidirectional Attention Long Short-Term Memory network. The former extracts visual features from video or image sequences, and the latter mines the correlations between images captured along long trajectories. Together they improve the localization accuracy and robustness of visual odometry. Extensive experiments on the KITTI benchmark show that the proposed algorithm outperforms other leading algorithms.
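
The abstract does not give the network details, so the following is only a minimal sketch of generic scaled dot-product self-attention, the mechanism that both components build on. It is not the authors' implementation. The function names, the toy `frames` features, and the use of identity query/key/value projections (no learned weights) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    tokens: list of equal-length feature vectors, e.g. image-patch or
    per-frame features. Learned projection matrices are omitted here
    (an assumption for brevity, not the paper's design).
    Returns (attended feature vectors, attention weight rows).
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Each output is the attention-weighted average of all tokens.
        outputs.append([sum(w * value[i] for w, value in zip(row, tokens))
                        for i in range(dim)])
    return outputs, weights


# Hypothetical 2-D features for three consecutive frames.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
attended, attn = self_attention(frames)
```

In this toy input the first frame attends more strongly to the similar second frame than to the dissimilar third one, which is the kind of correlation between images that an attention layer can weight when aggregating features over a trajectory.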