
Latest Publications in Autonomous Robots

Chasing millimeters: design, navigation and state estimation for precise in-flight marking on ceilings
IF 3.5 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-10-26 | DOI: 10.1007/s10514-023-10141-5
Christian Lanegger, Michael Pantic, Rik Bähnemann, Roland Siegwart, Lionel Ott

Precise markings for drilling and assembly are crucial, laborious construction tasks. Aerial robots with suitable end-effectors are capable of markings at the millimeter scale. However, so far, they have only been demonstrated under laboratory conditions where rigid state estimation and navigation assumptions do not impede robustness and accuracy. This paper presents a complete aerial layouting system capable of precise markings on-site under realistic conditions. We use a compliant actuated end-effector on an omnidirectional flying base. Combining a two-stage factor-graph state estimator with a Riemannian Motion Policy-based navigation stack, we avoid the need for a globally consistent state estimate and increase robustness. The policy-based navigation is structured into individual behaviors in different state spaces. Through a comprehensive study, we show that the system creates highly precise markings at a relative precision of 1.5 mm and a global accuracy of 5–6 mm and discuss the results in the context of future construction robotics.
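The navigation stack described above builds on Riemannian Motion Policies (RMPs). As a rough illustration of the core combination rule, not the authors' implementation, the sketch below blends two hypothetical behaviors (a target attractor and a clearance policy), each contributing a desired acceleration f and an importance metric A; all values are made up.

    import numpy as np

    def combine_rmps(policies):
        """policies: list of (f, A) pairs; f is a desired acceleration (n,),
        A is the policy's Riemannian metric (n, n) weighting its importance."""
        A_sum = sum(A for _, A in policies)
        pulled = sum(A @ f for f, A in policies)
        # The pseudo-inverse leaves directions no policy cares about unconstrained.
        return np.linalg.pinv(A_sum) @ pulled

    # Hypothetical behaviors for a ceiling-marking flight (illustrative values):
    target = np.array([0.0, 0.0, 3.0])                  # marking point on the ceiling
    pos = np.array([0.2, -0.1, 2.5])
    go_to_target = (2.0 * (target - pos), np.eye(3))    # attractor in task space
    keep_clear = (np.array([0.0, 0.8, 0.0]), np.diag([0.0, 5.0, 0.0]))  # acts on y only
    print(combine_rmps([go_to_target, keep_clear]))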

Citations: 0
Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation
IF 3.5 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-10-26 | DOI: 10.1007/s10514-023-10147-z
Marco Rosano, Antonino Furnari, Luigi Gulino, Corrado Santoro, Giovanni Maria Farinella

Robot visual navigation is a relevant research topic. Current deep navigation models conveniently learn navigation policies in simulation, given the large amount of experience they need to collect. Unfortunately, the resulting models show limited generalization ability when deployed in the real world. In this work we explore solutions to facilitate the development of visual navigation policies trained in simulation that can be successfully transferred to the real world. We first propose an efficient evaluation tool to reproduce realistic navigation episodes in simulation. We then investigate a variety of deep fusion architectures to combine a set of mid-level representations, with the aim of finding the merge strategy that maximizes real-world performance. Our experiments, performed both in simulation and on a robotic platform, show the effectiveness of the considered mid-level-representation-based models and confirm the reliability of the evaluation tool. The 3D models of the environment and the code of the validation tool are publicly available at the following link: https://iplab.dmi.unict.it/EmbodiedVN/.
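To make the fusion idea concrete, here is a toy numpy sketch, under the assumption that three mid-level representations (depth, surface normals, segmentation) have already been encoded into fixed-size feature vectors; it contrasts plain concatenation with softmax-gated blending. The shapes and weights are illustrative, not the paper's architecture.

    import numpy as np

    rng = np.random.default_rng(0)
    depth_feat = rng.standard_normal(128)    # stand-in for a depth-estimation encoding
    normal_feat = rng.standard_normal(128)   # stand-in for a surface-normal encoding
    segm_feat = rng.standard_normal(128)     # stand-in for a segmentation encoding

    # Early fusion: concatenate, then a single linear action head.
    concat = np.concatenate([depth_feat, normal_feat, segm_feat])
    W_early = rng.standard_normal((4, concat.size)) * 0.01
    action_early = (W_early @ concat).argmax()

    # Gated fusion: softmax-normalized scalar weights per representation.
    gates = np.exp(np.array([0.3, 1.2, -0.5]))
    gates /= gates.sum()
    blended = gates[0] * depth_feat + gates[1] * normal_feat + gates[2] * segm_feat
    W_gate = rng.standard_normal((4, 128)) * 0.01
    action_gated = (W_gate @ blended).argmax()
    print(action_early, action_gated)  # indices into a discrete action set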

Citations: 2
Large language models for chemistry robotics
IF 3.5 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-10-25 | DOI: 10.1007/s10514-023-10136-2
Naruki Yoshikawa, Marta Skreta, Kourosh Darvish, Sebastian Arellano-Rubach, Zhi Ji, Lasse Bjørn Kristensen, Andrew Zou Li, Yuchi Zhao, Haoping Xu, Artur Kuramshin, Alán Aspuru-Guzik, Florian Shkurti, Animesh Garg

This paper proposes an approach to automate chemistry experiments using robots by translating natural language instructions into robot-executable plans, using large language models together with task and motion planning. Adding natural language interfaces to autonomous chemistry experiment systems lowers the barrier to using complicated robotics systems and increases utility for non-expert users, but translating natural language experiment descriptions from users into low-level robotics languages is nontrivial. Furthermore, while recent advances have used large language models to generate task plans, reliably executing those plans in the real world by an embodied agent remains challenging. To enable autonomous chemistry experiments and alleviate the workload of chemists, robots must interpret natural language commands, perceive the workspace, autonomously plan multi-step actions and motions, consider safety precautions, and interact with various laboratory equipment. Our approach, CLAIRify, combines automatic iterative prompting with program verification to ensure syntactically valid programs in a data-scarce domain-specific language that incorporates environmental constraints. The generated plan is executed through solving a constrained task and motion planning problem using PDDLStream solvers to prevent spillages of liquids as well as collisions in chemistry labs. We demonstrate the effectiveness of our approach in planning chemistry experiments, with plans successfully executed on a real robot using a repertoire of robot skills and lab tools. Specifically, we showcase the utility of our framework in pouring skills for various materials and two fundamental chemical experiments for materials synthesis: solubility and recrystallization. Further details about CLAIRify can be found at https://ac-rad.github.io/clairify/.
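The generate-verify-repair loop at the heart of CLAIRify can be sketched in a few lines. The llm() and verify() functions below are toy stand-ins (a canned responder and a fake XML-style syntax check), not the authors' model or their verifier; only the loop structure reflects the abstract.

    def llm(prompt: str) -> str:
        # Toy stand-in: a real system would query a large language model here.
        if "fix this error" in prompt.lower():
            return "<Add vessel='beaker' reagent='NaCl' amount='5 g'/>"
        return "Add 5 g NaCl to the beaker"  # first draft: free text, not valid DSL

    def verify(program: str):
        # Toy check standing in for a real domain-specific-language verifier.
        if program.startswith("<") and program.endswith("/>"):
            return True, ""
        return False, "expected an XML-style DSL instruction"

    def generate_plan(instruction: str, max_rounds: int = 5) -> str:
        prompt = f"Translate to the lab DSL: {instruction}"
        for _ in range(max_rounds):
            program = llm(prompt)
            ok, error = verify(program)
            if ok:
                return program      # hand off to task-and-motion planning
            prompt = f"Fix this error ({error}) in: {program}"
        raise RuntimeError("no syntactically valid program within budget")

    print(generate_plan("dissolve 5 g of NaCl in the beaker"))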

Citations: 0
Semantic anomaly detection with large language models
IF 3.5 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-10-23 | DOI: 10.1007/s10514-023-10132-6
Amine Elhafsi, Rohan Sinha, Christopher Agia, Edward Schmerling, Issa A. D. Nesnas, Marco Pavone

As robots acquire increasingly sophisticated skills and see increasingly complex and varied environments, the threat of an edge case or anomalous failure is ever present. For example, Tesla cars have seen interesting failure modes ranging from autopilot disengagements due to inactive traffic lights carried by trucks to phantom braking caused by images of stop signs on roadside billboards. These system-level failures are not due to failures of any individual component of the autonomy stack but rather system-level deficiencies in semantic reasoning. Such edge cases, which we call semantic anomalies, are simple for a human to disentangle yet require insightful reasoning. To this end, we study the application of large language models (LLMs), endowed with broad contextual understanding and reasoning capabilities, to recognize such edge cases and introduce a monitoring framework for semantic anomaly detection in vision-based policies. Our experiments apply this framework to a finite state machine policy for autonomous driving and a learned policy for object manipulation. These experiments demonstrate that the LLM-based monitor can effectively identify semantic anomalies in a manner that shows agreement with human reasoning. Finally, we provide an extended discussion on the strengths and weaknesses of this approach and motivate a research outlook on how we can further use foundation models for semantic anomaly detection. Our project webpage can be found at https://sites.google.com/view/llm-anomaly-detection.
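A minimal sketch of the monitoring pattern the abstract describes: serialize the perception system's detections into a natural-language prompt and ask a language model whether the scene is semantically anomalous. query_llm() is a hypothetical stand-in heuristic, not a real API call, and the scene mirrors the truck-mounted traffic light example above.

    def query_llm(prompt: str) -> str:
        # Stand-in for an LLM call; handles only the truck/traffic-light case.
        if "traffic light" in prompt and "truck" in prompt:
            return "anomaly: the traffic light is likely cargo, not infrastructure"
        return "normal"

    def build_prompt(detections):
        listing = "; ".join(f"{d['label']} at {d['range_m']} m" for d in detections)
        return ("You monitor an autonomous car. Detected objects: "
                f"{listing}. Is any of them a semantic anomaly the driving "
                "policy might misinterpret?")

    scene = [
        {"label": "truck", "range_m": 12},
        {"label": "traffic light", "range_m": 12},  # carried on the truck bed
    ]
    print(query_llm(build_prompt(scene)))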

Citations: 5
Reinforcement learning for shared autonomy drone landings
IF 3.5 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-10-21 | DOI: 10.1007/s10514-023-10143-3
Kal Backman, Dana Kulić, Hoam Chung

Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs), due to the complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface, and additional disturbances from the ground effect. We therefore propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe landing zones are limited. Our approach comprises two modules: a perception module that encodes information onto a compressed latent representation using two RGB-D cameras, and a policy module trained with the reinforcement learning algorithm TD3 to discern the pilot's intent and to provide control inputs that augment the user's input to safely land the UAV. The policy module is trained in simulation using a population of simulated users. Simulated users are sampled from a parametric model with four parameters, which model a pilot's tendency to conform to the assistant, proficiency, aggressiveness, and speed. We conduct a user study (n = 28) where human participants were tasked with landing a physical UAV on one of several platforms under challenging viewing conditions. The assistant, trained with only simulated user data, improved the task success rate from 51.4% to 98.2% despite being unaware of the human participants' goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.
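The four-parameter simulated-pilot model and the input-blending step lend themselves to a short sketch. Everything below is schematic: the parameter ranges, the noise model, and the fixed blend weight are assumptions, and a real assistant would derive its command and blending from the trained TD3 policy.

    import numpy as np

    rng = np.random.default_rng(1)

    def sample_simulated_user():
        return {
            "conformance": rng.uniform(0.0, 1.0),    # tendency to follow the assistant
            "proficiency": rng.uniform(0.0, 1.0),
            "aggressiveness": rng.uniform(0.0, 1.0), # unused in this toy rollout
            "speed": rng.uniform(0.5, 1.5),
        }

    def user_command(user, ideal_cmd, assistant_cmd):
        noisy = ideal_cmd + (1.0 - user["proficiency"]) * rng.normal(0.0, 0.3, 3)
        cmd = (1.0 - user["conformance"]) * noisy + user["conformance"] * assistant_cmd
        return user["speed"] * cmd

    def blended_action(user_cmd, assistant_cmd, alpha=0.5):
        # A trained policy would output assistant_cmd (and, implicitly, the blend).
        return alpha * user_cmd + (1.0 - alpha) * assistant_cmd

    ideal = np.array([0.0, 0.0, -0.5])        # descend toward the landing pad
    assist = np.array([0.05, -0.02, -0.4])    # assistant's suggested velocity command
    pilot = sample_simulated_user()
    print(blended_action(user_command(pilot, ideal, assist), assist))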

Citations: 2
Why ORB-SLAM is missing commonly occurring loop closures?
IF 3.5 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-10-20 | DOI: 10.1007/s10514-023-10149-x
Saran Khaliq, Muhammad Latif Anjum, Wajahat Hussain, Muhammad Uzair Khattak, Momen Rasool

We analyse, for the first time, the popular loop closing module of a well known and widely used open-source visual SLAM (ORB-SLAM) pipeline. Investigating failures in the loop closure module of visual SLAM is challenging since it consists of multiple building blocks. Our meticulous investigations have revealed a few interesting findings. Contrary to reported results, ORB-SLAM frequently misses a large fraction of loop closures on public (KITTI, TUM RGB-D) datasets. One common assumption is that, in such scenarios, the visual place recognition (vPR) block of the loop closure module is unable to find a suitable match due to extreme conditions (dynamic scene, viewpoint/scale changes). We report that the native vPR of ORB-SLAM is not the sole reason for these failures. Although recent deep vPR alternatives achieve impressive matching performance, replacing native vPR with these deep alternatives only partially improves the loop closure performance of visual SLAM. Our findings suggest that the problem lies with the subsequent relative pose estimation module between the matching pair. ORB-SLAM3 has improved the recall of the original loop closing module; however, even in ORB-SLAM3, the loop closing module is the major reason behind loop closing failures. Surprisingly, using off-the-shelf ORB and SIFT based relative pose estimators (non real-time) manages to close most of the loops missed by ORB-SLAM. This significant performance gap between the two available methods suggests that ORB-SLAM's pipeline can be further matured by focusing on the relative pose estimators, to improve loop closure performance, rather than investing more resources on improving vPR. We also evaluate deep alternatives for relative pose estimation in the context of loop closures. Interestingly, the performance of deep relocalization methods (e.g. MapNet) is worse than that of classic methods even in loop closure scenarios. This finding further supports the recently diagnosed fundamental limitation of deep relocalization methods. Finally, we expose a bias in a well-known public dataset (KITTI) due to which these commonly occurring failures have eluded the community. We augment the KITTI dataset with detailed loop closing labels. To compensate for the bias in the public datasets, we provide a challenging loop closure dataset which contains challenging yet commonly occurring indoor navigation scenarios with loop closures. We hope our findings and the accompanying dataset will help the community in further improving the popular ORB-SLAM pipeline.
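The "off-the-shelf relative pose estimator" baseline can be sketched with standard OpenCV calls: match ORB descriptors between the two loop images, estimate the essential matrix with RANSAC, and recover the loop-edge pose. This is a generic two-view pipeline, not the paper's exact configuration; the intrinsics and image paths are placeholders.

    import cv2
    import numpy as np

    def relative_pose(img1, img2, K):
        orb = cv2.ORB_create(2000)
        kp1, des1 = orb.detectAndCompute(img1, None)
        kp2, des2 = orb.detectAndCompute(img2, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = matcher.match(des1, des2)
        pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
        pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
        E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
        return R, t  # rotation and unit-scale translation of the loop edge

    K = np.array([[718.9, 0.0, 607.2],    # KITTI-like intrinsics, placeholder values
                  [0.0, 718.9, 185.2],
                  [0.0, 0.0, 1.0]])
    # img1 = cv2.imread("loop_query.png", cv2.IMREAD_GRAYSCALE)      # placeholder paths
    # img2 = cv2.imread("loop_candidate.png", cv2.IMREAD_GRAYSCALE)
    # R, t = relative_pose(img1, img2, K)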

Citations: 0
Reinforcement learning with model-based feedforward inputs for robotic table tennis
IF 3.5 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-10-17 | DOI: 10.1007/s10514-023-10140-6
Hao Ma, Dieter Büchler, Bernhard Schölkopf, Michael Muehlebach

We rethink the traditional reinforcement learning approach, which is based on optimizing over feedback policies, and propose a new framework that optimizes over feedforward inputs instead. This not only mitigates the risk of destabilizing the system during training but also reduces the bulk of the learning to a supervised learning task. As a result, efficient and well-understood supervised learning techniques can be applied and are tuned using a validation data set. The labels are generated with a variant of iterative learning control, which also includes prior knowledge about the underlying dynamics. Our framework is applied for intercepting and returning ping-pong balls that are played to a four-degrees-of-freedom robotic arm in real-world experiments. The robot arm is driven by pneumatic artificial muscles, which makes the control and learning tasks challenging. We highlight the potential of our framework by comparing it to a reinforcement learning approach that optimizes over feedback policies. We find that our framework achieves a higher success rate for the returns (100% vs. 96%, on 107 consecutive trials; see https://youtu.be/kR9jowEH7PY) while requiring only about one tenth of the samples during training. We also find that our approach is able to deal with a variety of different incoming trajectories.
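The label-generation step, in which iterative learning control (ILC) refines a feedforward input over repeated trials of the same trajectory, can be illustrated on a toy scalar plant. The dynamics, gain, and reference below are invented for the sketch; the converged inputs would then serve as supervised-learning labels.

    import numpy as np

    T = 50
    A, B = 0.5, 0.1                                # toy plant: x[t] = A*x[t-1] + B*u[t]
    x_ref = np.sin(np.linspace(0.0, np.pi, T))     # desired trajectory, x_ref[0] = 0

    def rollout(u):
        x, traj = 0.0, []
        for ut in u:
            x = A * x + B * ut
            traj.append(x)
        return np.array(traj)

    u = np.zeros(T)
    for trial in range(30):                        # repeated executions of the task
        e = x_ref - rollout(u)
        u = u + 5.0 * e                            # P-type ILC update; gain * B = 0.5 < 1
    print("final tracking error:", np.abs(x_ref - rollout(u)).max())
    # u now approximates the exact feedforward input and can label (state, input) pairs.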

Citations: 0
RoLoMa: robust loco-manipulation for quadruped robots with arms
IF 3.5 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-10-15 | DOI: 10.1007/s10514-023-10146-0
Henrique Ferrolho, Vladimir Ivan, Wolfgang Merkt, Ioannis Havoutis, Sethu Vijayakumar

Deployment of robotic systems in the real world requires a certain level of robustness in order to deal with uncertainty factors, such as mismatches in the dynamics model, noise in sensor readings, and communication delays. Some approaches tackle these issues reactively at the control stage. However, regardless of the controller, online motion execution can only be as robust as the system capabilities allow at any given state. This is why it is important to have good motion plans to begin with, where robustness is considered proactively. To this end, we propose a metric (derived from first principles) for representing robustness against external disturbances. We then use this metric within our trajectory optimization framework for solving complex loco-manipulation tasks. Through our experiments, we show that trajectories generated using our approach can resist a greater range of forces originating from any possible direction. By using our method, we can compute trajectories that solve tasks as effectively as before, with the added benefit of being able to counteract stronger disturbances in worst-case scenarios.
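One way to make the robustness metric concrete: for a fixed posture, the largest end-effector disturbance force the joint torque limits can still counteract, minimized over disturbance directions (using tau = J^T f). The Jacobian, torque limits, and direction sampling below are illustrative assumptions, not the paper's derivation.

    import numpy as np

    def worst_case_margin(J, tau_max, n_dirs=500, seed=0):
        """Smallest, over sampled unit force directions, of the largest force
        magnitude whose required torques J.T @ f stay within the limits."""
        rng = np.random.default_rng(seed)
        dirs = rng.standard_normal((n_dirs, 3))
        dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
        margins = []
        for u in dirs:
            tau_per_newton = np.abs(J.T @ u)        # torque needed per 1 N along u
            margins.append(np.min(tau_max / np.maximum(tau_per_newton, 1e-9)))
        return min(margins)  # worst direction: the posture's disturbance margin

    J = np.array([[0.40, 0.00, 0.10, 0.20],         # toy 3x4 end-effector Jacobian
                  [0.00, 0.50, 0.20, 0.00],
                  [0.10, 0.10, 0.30, 0.10]])
    tau_max = np.array([8.0, 8.0, 5.0, 3.0])        # per-joint torque limits (N m)
    print("worst-case counteractable force:", worst_case_margin(J, tau_max), "N")

A trajectory optimizer in this spirit would favor postures that keep this margin high along the whole motion.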

Citations: 17
FuseBot: mechanical search of rigid and deformable objects via multi-modal perception
IF 3.5 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-09-23 | DOI: 10.1007/s10514-023-10137-1
Tara Boroushaki, Laura Dodds, Nazish Naeem, Fadel Adib

Mechanical search is a robotic problem where a robot needs to retrieve a target item that is partially or fully-occluded from its camera. State-of-the-art approaches for mechanical search either require an expensive search process to find the target item, or they require the item to be tagged with a radio frequency identification tag (e.g., RFID), making their approach beneficial only to tagged items in the environment. We present FuseBot, the first robotic system for RF-Visual mechanical search that enables efficient retrieval of both RF-tagged and untagged items in a pile. Rather than requiring all target items in a pile to be RF-tagged, FuseBot leverages the mere existence of an RF-tagged item in the pile to benefit both tagged and untagged items. Our design introduces two key innovations. The first is RF-Visual Mapping, a technique that identifies and locates RF-tagged items in a pile and uses this information to construct an RF-Visual occupancy distribution map. The second is RF-Visual Extraction, a policy formulated as an optimization problem that minimizes the number of actions required to extract the target object by accounting for the probabilistic occupancy distribution, the expected grasp quality, and the expected information gain from future actions. We built a real-time end-to-end prototype of our system on a UR5e robotic arm with in-hand vision and RF perception modules. We conducted over 200 real-world experimental trials to evaluate FuseBot and compare its performance to a state-of-the-art vision-based system named X-Ray (Danielczuk et al., in: 2020 IEEE/RSJ international conference on intelligent robots and systems (IROS), IEEE, 2020). Our experimental results demonstrate that FuseBot outperforms X-Ray’s efficiency by more than 40% in terms of the number of actions required for successful mechanical search. Furthermore, in comparison to X-Ray’s success rate of 84%, FuseBot achieves a success rate of 95% in retrieving untagged items, demonstrating for the first time that the benefits of RF perception extend beyond tagged objects in the mechanical search problem.
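The extraction policy's ingredients (occupancy probability, grasp quality, expected information gain) can be mocked up in a few lines. This is a loose toy, not FuseBot's optimization: the probabilities are invented, and the information gain here is a single-outcome entropy reduction rather than a full expectation over action outcomes.

    import numpy as np

    def entropy(p):
        p = np.clip(p, 1e-9, 1.0)
        return float(-(p * np.log(p)).sum())

    occupancy = np.array([0.05, 0.60, 0.25, 0.10])   # P(target hidden under item i)
    grasp_quality = np.array([0.9, 0.7, 0.8, 0.5])   # P(removing item i succeeds)

    def score(i):
        # Posterior over hiding spots if item i is removed and the target isn't there.
        post = occupancy.copy()
        post[i] = 0.0
        post /= post.sum()
        info_gain = entropy(occupancy) - entropy(post)
        return occupancy[i] * grasp_quality[i] + 0.1 * info_gain

    best = max(range(len(occupancy)), key=score)
    print("next extraction: item", best)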

Citations: 1
UVS: underwater visual SLAM—a robust monocular visual SLAM system for lifelong underwater operations
IF 3.5 | CAS Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-09-22 | DOI: 10.1007/s10514-023-10138-0
Marco Leonardi, Annette Stahl, Edmund Førland Brekke, Martin Ludvigsen

In this paper, a visual simultaneous localization and mapping (VSLAM/visual SLAM) system called the underwater visual SLAM (UVS) system is presented, specifically tailored for camera-only navigation in natural underwater environments. The UVS system is particularly optimized towards precision and robustness, as well as lifelong operations. We build upon Oriented FAST (Features from Accelerated Segment Test) and Rotated BRIEF (Binary Robust Independent Elementary Features) simultaneous localization and mapping (ORB-SLAM) and improve accuracy by performing an exact search in the descriptor space during triangulation, and robustness by utilizing a unified initialization method and a motion model. In addition, we present a scale-agnostic station-keeping detection method, which aims to optimize the map and poses during station-keeping, and a pruning strategy which takes into account a point's age and distance to the active keyframe. An exhaustive evaluation is presented to the reader, using a total of 38 in-air and underwater sequences.
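The "exact search in the descriptor space" during triangulation can be contrasted with windowed matching by brute-forcing Hamming distances over every descriptor in the other keyframe. The random 32-byte descriptors below merely stand in for real ORB descriptors.

    import numpy as np

    rng = np.random.default_rng(2)
    db = rng.integers(0, 256, size=(1000, 32), dtype=np.uint8)  # keyframe descriptors
    query = rng.integers(0, 256, size=32, dtype=np.uint8)       # one ORB descriptor

    def hamming(a, b):
        # Popcount of XOR: exact Hamming distance between bit-string descriptors.
        return np.unpackbits(np.bitwise_xor(a, b), axis=-1).sum(axis=-1)

    dists = hamming(db, query)          # exact: no search window, no vocabulary pruning
    order = np.argsort(dists)
    best, second = order[0], order[1]
    print("best match:", best, "at distance", dists[best],
          "(second best:", dists[second], ")")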

Citations: 0