Saran Khaliq, Muhammad Latif Anjum, Wajahat Hussain, Muhammad Uzair Khattak, Momen Rasool
Autonomous Robots, vol. 47, no. 8, pp. 1519-1535. Published 2023-10-20. DOI: 10.1007/s10514-023-10149-x
https://link.springer.com/article/10.1007/s10514-023-10149-x
Impact factor: 3.7. JCR: Q2 (Computer Science, Artificial Intelligence).
Citations: 0
Abstract
Why ORB-SLAM is missing commonly occurring loop closures?
We analyse, for the first time, the popular loop closing module of a well-known and widely used open-source visual SLAM (ORB-SLAM) pipeline. Investigating failures in the loop closure module of visual SLAM is challenging since it consists of multiple building blocks. Our meticulous investigations have revealed a few interesting findings. Contrary to reported results, ORB-SLAM frequently misses a large fraction of loop closures on public (KITTI, TUM RGB-D) datasets. One common assumption is that, in such scenarios, the visual place recognition (vPR) block of the loop closure module is unable to find a suitable match due to extreme conditions (dynamic scenes, viewpoint/scale changes). We report that the native vPR of ORB-SLAM is not the sole reason for these failures. Although recent deep vPR alternatives achieve impressive matching performance, replacing the native vPR with these deep alternatives will only partially improve the loop closure performance of visual SLAM. Our findings suggest that the problem lies with the subsequent relative pose estimation module between the matching pair. ORB-SLAM3 has improved the recall of the original loop closing module. However, even in ORB-SLAM3, the loop closing module is the major reason behind loop closing failures. Surprisingly, using off-the-shelf ORB- and SIFT-based relative pose estimators (non-real-time) manages to close most of the loops missed by ORB-SLAM. This significant performance gap between the two available methods suggests that ORB-SLAM’s pipeline can be further matured by focusing on the relative pose estimators, to improve loop closure performance, rather than investing more resources in improving vPR. We also evaluate deep alternatives for relative pose estimation in the context of loop closures. Interestingly, the performance of deep relocalization methods (e.g. MapNet) is worse than that of classic methods even in loop closure scenarios. This finding further supports the recently diagnosed fundamental limitation of deep relocalization methods. Finally, we expose a bias in a well-known public dataset (KITTI) due to which these commonly occurring failures have eluded the community. We augment the KITTI dataset with detailed loop closing labels. In order to compensate for the bias in the public datasets, we provide a challenging loop closure dataset which contains difficult yet commonly occurring indoor navigation scenarios with loop closures. We hope our findings and the accompanying dataset will help the community in further improving the popular ORB-SLAM pipeline.
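The abstract describes loop closing as two stages: vPR proposes a matching pair, then relative pose estimation must succeed on that pair. The following minimal Python sketch shows one way a missed loop could be attributed to a stage once ground-truth loop labels are available. The `LoopEvent` record, its field names, and the sample events are hypothetical and made up for illustration; they are not the authors' code, label format, or results.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopEvent:
    """One ground-truth loop closure opportunity (hypothetical record)."""
    query_frame: int
    vpr_matched: bool     # did place recognition propose a correct candidate?
    pose_estimated: bool  # did relative pose estimation succeed on that candidate?


def classify(event: LoopEvent) -> str:
    """Attribute a ground-truth loop to the stage that decided its outcome."""
    if not event.vpr_matched:
        return "missed_by_vpr"
    if not event.pose_estimated:
        return "missed_by_pose"
    return "closed"


def summarize(events: list[LoopEvent]) -> dict:
    """Count outcomes per stage and compute loop closure recall."""
    counts = Counter(classify(e) for e in events)
    total = len(events)
    return {
        "total": total,
        "closed": counts["closed"],
        "missed_by_vpr": counts["missed_by_vpr"],
        "missed_by_pose": counts["missed_by_pose"],
        # Recall = closed loops / all ground-truth loops.
        "recall": counts["closed"] / total if total else 0.0,
    }


# Made-up events for illustration only; not data from the paper.
events = [
    LoopEvent(10, True, True),
    LoopEvent(25, True, False),
    LoopEvent(40, False, False),
    LoopEvent(55, True, False),
]
summary = summarize(events)
print(summary)
```

Separating the two miss categories is what makes it possible to ask whether improving vPR alone would recover the missed loops, or whether the pose estimation stage would still reject them.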
About the journal:
Autonomous Robots reports on the theory and applications of robotic systems capable of some degree of self-sufficiency. It features papers that include performance data on actual robots in the real world. Coverage includes: control of autonomous robots · real-time vision · autonomous wheeled and tracked vehicles · legged vehicles · computational architectures for autonomous systems · distributed architectures for learning, control and adaptation · studies of autonomous robot systems · sensor fusion · theory of autonomous systems · terrain mapping and recognition · self-calibration and self-repair for robots · self-reproducing intelligent structures · genetic algorithms as models for robot development.
The focus is on the ability to move and be self-sufficient, not on whether the system is an imitation of biology. Of course, biological models for robotic systems are of major interest to the journal since living systems are prototypes for autonomous behavior.