Visual Odometry in Dynamic Environments using Light Weight Semantic Segmentation

Richard Josiah C. Tan Ai, Dino Dominic F. Ligutan, Allysa Kate M. Brillantes, Jason L. Española, E. Dadios
{"title":"Visual Odometry in Dynamic Environments using Light Weight Semantic Segmentation","authors":"Richard Josiah C. Tan Ai, Dino Dominic F. Ligutan, Allysa Kate M. Brillantes, Jason L. Española, E. Dadios","doi":"10.1109/HNICEM48295.2019.9073562","DOIUrl":null,"url":null,"abstract":"Visual odometry is the method in which a robot tracks its position and orientation using a sequence of images. Feature based visual odometry matches feature between frames and estimates the pose of the robot according to the matched features. These methods typically assume a static environment and relies on statistical methods such as RANSAC to remove outliers such as moving objects. But in highly dynamic environment where majority of the scene is composed of moving objects these methods fail. This paper proposes to use the feature based visual odometry part of ORB-SLAM2 RGB-D and improve it using DeepLabv3-MobileNetV2 semantic segmentation. The semantic segmentation algorithm is used to segment the image, then extracted feature points that are on pixels of dynamic objects (people) are not tracked. The method is tested on TUM-RGBD dataset. Evaluation shows that the proposed algorithm performs significantly better in dynamic scenes compared to the base algorithm, with reduction in Absolute Trajectory Error (ATE) greater than 92.90% compared to the base algorithm in fr3w_xyz, fr3w_rpy and fr3_half sequences. Additionally, when comparing the algorithm that used DeepLabv3-MobileNetV2 to the computationally intensive DeepLabv3-Xception65, the largest increase in ATE was 27%, while the computation time is 3 times faster.","PeriodicalId":6733,"journal":{"name":"2019 IEEE 11th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management ( HNICEM )","volume":"12 1","pages":"1-5"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 11th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management ( HNICEM )","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HNICEM48295.2019.9073562","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Visual odometry is the method by which a robot tracks its position and orientation using a sequence of images. Feature-based visual odometry matches features between frames and estimates the pose of the robot from the matched features. These methods typically assume a static environment and rely on statistical techniques such as RANSAC to remove outliers such as moving objects. In highly dynamic environments, however, where the majority of the scene is composed of moving objects, these methods fail. This paper proposes to take the feature-based visual odometry component of ORB-SLAM2 (RGB-D) and improve it using DeepLabv3-MobileNetV2 semantic segmentation. The semantic segmentation network segments each image, and extracted feature points that fall on pixels belonging to dynamic objects (people) are excluded from tracking. The method is evaluated on the TUM RGB-D dataset. The evaluation shows that the proposed algorithm performs significantly better in dynamic scenes than the base algorithm, reducing the Absolute Trajectory Error (ATE) by more than 92.90% on the fr3w_xyz, fr3w_rpy, and fr3_half sequences. Additionally, when comparing the algorithm using DeepLabv3-MobileNetV2 against one using the computationally intensive DeepLabv3-Xception65, the largest increase in ATE was 27%, while computation was three times faster.
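The masking step described in the abstract is easy to illustrate outside of ORB-SLAM2. The sketch below is not the authors' C++ implementation; it is a minimal Python approximation that assumes a hypothetical segment_frame() wrapper around a DeepLabv3-MobileNetV2 model returning a per-pixel class-ID map, and uses OpenCV's ORB detector to stand in for ORB-SLAM2's feature extraction. Keypoints whose pixel is labelled as a person are dropped before matching and pose estimation.

```python
import cv2
import numpy as np

PERSON_CLASS_ID = 15  # PASCAL VOC "person" label; adjust to the model's label map


def segment_frame(bgr_image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a DeepLabv3-MobileNetV2 forward pass.

    Expected to return an (H, W) array of per-pixel class IDs."""
    raise NotImplementedError("plug in a semantic segmentation model here")


def extract_static_keypoints(bgr_image: np.ndarray):
    """Detect ORB keypoints, then keep only those not lying on dynamic-object pixels."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create(nfeatures=1000)
    keypoints, descriptors = orb.detectAndCompute(gray, None)
    if descriptors is None:
        return [], None

    class_map = segment_frame(bgr_image)  # (H, W) class IDs, same size as the frame
    kept_kps, kept_desc = [], []
    for kp, desc in zip(keypoints, descriptors):
        x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
        if class_map[y, x] != PERSON_CLASS_ID:  # discard points that land on people
            kept_kps.append(kp)
            kept_desc.append(desc)

    return kept_kps, (np.asarray(kept_desc) if kept_desc else None)
```

The reported metric, Absolute Trajectory Error, is not defined in the abstract. For reference, the TUM RGB-D benchmark computes it by first aligning the estimated trajectory P_1, ..., P_n to the ground truth Q_1, ..., Q_n with a least-squares rigid-body transform S, then taking the RMSE of the translational part of the per-frame error:

```latex
F_i = Q_i^{-1} \, S \, P_i, \qquad
\mathrm{ATE}_{\mathrm{RMSE}} = \left( \frac{1}{n} \sum_{i=1}^{n} \left\lVert \operatorname{trans}(F_i) \right\rVert^{2} \right)^{1/2}
```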