{"title":"AFO-SLAM:利用加速特征提取和目标检测改进动态场景中的视觉 SLAM","authors":"Jinbi Wei, Heng Deng, Jihong Wang, Liguo Zhang","doi":"10.1088/1361-6501/ad6627","DOIUrl":null,"url":null,"abstract":"\n In visual simultaneous localization and mapping (SLAM) systems, traditional methods often excel due to rigid environmental assumptions, but face challenges in dynamic environments. To address this, learning-based approaches have been introduced, but their expensive computing costs hinder real-time performance, especially on embedded mobile platforms. In this article, we propose a robust and real-time visual SLAM method towards dynamic environments using acceleration of feature extraction and object detection (AFO-SLAM). First, AFO-SLAM employs an independent object detection thread that utilizes YOLOv5 to extract semantic information and identify the bounding boxes of moving objects. To preserve the background points within these boxes, depth information is utilized to segment target foreground and background with only a single frame, with the points of the foreground area considered as dynamic points and then rejected. To optimize performance, CUDA program accelerates feature extraction preceding point removal. Finally, extensive evaluations are performed on both TUM RGB-D dataset and real scenes using a low-power embedded platform. Experimental results demonstrate that AFO-SLAM offers a balance between accuracy and real-time performance on embedded platforms, and enables the generation of dense point cloud maps in dynamic scenarios.","PeriodicalId":510602,"journal":{"name":"Measurement Science and Technology","volume":"42 18","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"AFO-SLAM: an improved visual SLAM in dynamic scenes using acceleration of feature extraction and object detection\",\"authors\":\"Jinbi Wei, Heng Deng, Jihong Wang, Liguo Zhang\",\"doi\":\"10.1088/1361-6501/ad6627\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n In visual simultaneous localization and mapping (SLAM) systems, traditional methods often excel due to rigid environmental assumptions, but face challenges in dynamic environments. To address this, learning-based approaches have been introduced, but their expensive computing costs hinder real-time performance, especially on embedded mobile platforms. In this article, we propose a robust and real-time visual SLAM method towards dynamic environments using acceleration of feature extraction and object detection (AFO-SLAM). First, AFO-SLAM employs an independent object detection thread that utilizes YOLOv5 to extract semantic information and identify the bounding boxes of moving objects. To preserve the background points within these boxes, depth information is utilized to segment target foreground and background with only a single frame, with the points of the foreground area considered as dynamic points and then rejected. To optimize performance, CUDA program accelerates feature extraction preceding point removal. Finally, extensive evaluations are performed on both TUM RGB-D dataset and real scenes using a low-power embedded platform. 
Experimental results demonstrate that AFO-SLAM offers a balance between accuracy and real-time performance on embedded platforms, and enables the generation of dense point cloud maps in dynamic scenarios.\",\"PeriodicalId\":510602,\"journal\":{\"name\":\"Measurement Science and Technology\",\"volume\":\"42 18\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Measurement Science and Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1088/1361-6501/ad6627\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Measurement Science and Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1088/1361-6501/ad6627","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
In visual simultaneous localization and mapping (SLAM) systems, traditional methods perform well under the assumption of a rigid, static environment but face challenges in dynamic scenes. Learning-based approaches have been introduced to address this, but their high computational cost hinders real-time performance, especially on embedded mobile platforms. In this article, we propose AFO-SLAM, a robust, real-time visual SLAM method for dynamic environments that accelerates feature extraction and object detection. First, AFO-SLAM runs an independent object-detection thread that uses YOLOv5 to extract semantic information and identify the bounding boxes of moving objects. To preserve the background points inside these boxes, depth information from a single frame is used to separate the target foreground from the background; points in the foreground region are treated as dynamic and rejected. To optimize performance, a CUDA program accelerates feature extraction before dynamic-point removal. Finally, extensive evaluations are performed on both the TUM RGB-D dataset and real scenes using a low-power embedded platform. Experimental results demonstrate that AFO-SLAM balances accuracy and real-time performance on embedded platforms and can generate dense point-cloud maps in dynamic scenarios.
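The abstract does not give implementation details, but the detection step can be illustrated with the public YOLOv5 torch.hub API. The sketch below is a minimal, hypothetical stand-in for the paper's detection thread (threading omitted for brevity); the set of classes treated as dynamic and the confidence cutoff are assumptions.

```python
import torch

# Pretrained YOLOv5 model from the official ultralytics hub.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

def detect_dynamic_boxes(rgb_frame, dynamic_classes=("person",)):
    """Return [x1, y1, x2, y2] boxes for classes treated as dynamic."""
    results = model(rgb_frame)  # inference on a single RGB frame
    boxes = []
    # results.xyxy[0] is an (N, 6) tensor: x1, y1, x2, y2, confidence, class id
    for *xyxy, conf, cls in results.xyxy[0].tolist():
        if results.names[int(cls)] in dynamic_classes and conf > 0.5:
            boxes.append([int(v) for v in xyxy])
    return boxes
```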
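The single-frame foreground/background split within a detection box can be sketched with a simple depth heuristic. The median-depth threshold below is an assumption for illustration only; the abstract does not specify the paper's exact segmentation rule.

```python
import numpy as np

def reject_dynamic_points(keypoints, depth, box, margin=0.2):
    """Split a detection box into foreground/background by depth and
    keep only keypoints that fall outside the foreground.

    keypoints : list of (u, v) pixel coordinates
    depth     : HxW depth image in meters (0 = invalid)
    box       : [x1, y1, x2, y2] from the detector
    """
    x1, y1, x2, y2 = box
    roi = depth[y1:y2, x1:x2]
    valid = roi[roi > 0]
    if valid.size == 0:
        return keypoints  # no usable depth: keep everything (conservative)

    # Assumed heuristic: the detected object dominates the box, so its
    # depth clusters around the median; anything clearly farther away
    # is background.
    threshold = np.median(valid) * (1.0 + margin)

    kept = []
    for (u, v) in keypoints:
        inside = x1 <= u < x2 and y1 <= v < y2
        if inside and 0 < depth[int(v), int(u)] <= threshold:
            continue  # foreground of a dynamic object -> reject
        kept.append((u, v))
    return kept
```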
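AFO-SLAM accelerates feature extraction with its own CUDA program, which is not published in the abstract. As a stand-in, OpenCV's CUDA ORB module illustrates the same upload-extract-download pattern; this assumes an OpenCV build compiled with CUDA support (standard pip wheels do not include it).

```python
import cv2

# ORB detector that runs on the GPU (requires a CUDA-enabled OpenCV build).
orb = cv2.cuda_ORB.create(nfeatures=1000)

def extract_features_gpu(gray_frame):
    """Detect ORB keypoints and descriptors on the GPU for one grayscale frame."""
    gpu_frame = cv2.cuda_GpuMat()
    gpu_frame.upload(gray_frame)                      # host -> device copy
    kps_gpu, des_gpu = orb.detectAndComputeAsync(gpu_frame, None)
    keypoints = orb.convert(kps_gpu)                  # GpuMat -> list of cv2.KeyPoint
    descriptors = des_gpu.download()                  # device -> host copy
    return keypoints, descriptors
```

Running extraction on the GPU before dynamic-point removal keeps the CPU free for tracking and detection, which is consistent with the real-time goal the abstract describes for low-power embedded platforms.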