Semantic anomaly detection with large language models
Pub Date : 2023-10-23, DOI: 10.1007/s10514-023-10132-6, Autonomous Robots 47(8): 1035-1055
Amine Elhafsi, Rohan Sinha, Christopher Agia, Edward Schmerling, Issa A. D. Nesnas, Marco Pavone
As robots acquire increasingly sophisticated skills and see increasingly complex and varied environments, the threat of an edge case or anomalous failure is ever present. For example, Tesla cars have seen interesting failure modes ranging from autopilot disengagements due to inactive traffic lights carried by trucks to phantom braking caused by images of stop signs on roadside billboards. These system-level failures are not due to failures of any individual component of the autonomy stack but rather system-level deficiencies in semantic reasoning. Such edge cases, which we call semantic anomalies, are simple for a human to disentangle yet require insightful reasoning. To this end, we study the application of large language models (LLMs), endowed with broad contextual understanding and reasoning capabilities, to recognize such edge cases and introduce a monitoring framework for semantic anomaly detection in vision-based policies. Our experiments apply this framework to a finite state machine policy for autonomous driving and a learned policy for object manipulation. These experiments demonstrate that the LLM-based monitor can effectively identify semantic anomalies in a manner that shows agreement with human reasoning. Finally, we provide an extended discussion on the strengths and weaknesses of this approach and motivate a research outlook on how we can further use foundation models for semantic anomaly detection. Our project webpage can be found at https://sites.google.com/view/llm-anomaly-detection.
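To make the monitoring idea concrete, here is a minimal Python sketch of an LLM-based semantic anomaly monitor in the spirit of the paper; the prompt wording, the `query_llm` callable, and the YES/NO parsing are illustrative assumptions, not the authors' actual pipeline.

```python
from typing import Callable, List

# Hypothetical prompt template; the paper's actual prompts differ.
PROMPT_TEMPLATE = (
    "You monitor an autonomous {domain} policy. The vision system reports "
    "the following objects in the scene:\n{observations}\n"
    "Could any of these observations cause the policy to misbehave in a way "
    "a human would consider a semantic anomaly? Answer YES or NO, then "
    "explain briefly."
)

def detect_semantic_anomaly(observations: List[str],
                            domain: str,
                            query_llm: Callable[[str], str]) -> bool:
    """Return True if the LLM flags the scene as semantically anomalous."""
    prompt = PROMPT_TEMPLATE.format(
        domain=domain,
        observations="\n".join(f"- {o}" for o in observations),
    )
    reply = query_llm(prompt)
    return reply.strip().upper().startswith("YES")

# Example, echoing the billboard failure mode mentioned above:
# flagged = detect_semantic_anomaly(
#     ["stop sign printed on a roadside billboard", "clear road ahead"],
#     domain="driving",
#     query_llm=my_llm,  # hypothetical wrapper around your chosen LLM API
# )
```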
{"title":"Semantic anomaly detection with large language models","authors":"Amine Elhafsi, Rohan Sinha, Christopher Agia, Edward Schmerling, Issa A. D. Nesnas, Marco Pavone","doi":"10.1007/s10514-023-10132-6","DOIUrl":"10.1007/s10514-023-10132-6","url":null,"abstract":"<div><p>As robots acquire increasingly sophisticated skills and see increasingly complex and varied environments, the threat of an edge case or anomalous failure is ever present. For example, Tesla cars have seen interesting failure modes ranging from autopilot disengagements due to inactive traffic lights carried by trucks to phantom braking caused by images of stop signs on roadside billboards. These system-level failures are not due to failures of any individual component of the autonomy stack but rather system-level deficiencies in semantic reasoning. Such edge cases, which we call <i>semantic anomalies</i>, are simple for a human to disentangle yet require insightful reasoning. To this end, we study the application of large language models (LLMs), endowed with broad contextual understanding and reasoning capabilities, to recognize such edge cases and introduce a monitoring framework for semantic anomaly detection in vision-based policies. Our experiments apply this framework to a finite state machine policy for autonomous driving and a learned policy for object manipulation. These experiments demonstrate that the LLM-based monitor can effectively identify semantic anomalies in a manner that shows agreement with human reasoning. Finally, we provide an extended discussion on the strengths and weaknesses of this approach and motivate a research outlook on how we can further use foundation models for semantic anomaly detection. Our project webpage can be found at https://sites.google.com/view/llm-anomaly-detection. \u0000</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1035 - 1055"},"PeriodicalIF":3.5,"publicationDate":"2023-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135322901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reinforcement learning for shared autonomy drone landings
Pub Date : 2023-10-21, DOI: 10.1007/s10514-023-10143-3, Autonomous Robots 47(8): 1419-1438
Kal Backman, Dana Kulić, Hoam Chung
Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs) due to the complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface, and additional disturbances from the ground effect. Therefore, we propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe landing zones are limited. Our approach comprises two modules: a perception module that encodes information onto a compressed latent representation using two RGB-D cameras, and a policy module that is trained with the reinforcement learning algorithm TD3 to discern the pilot's intent and to provide control inputs that augment the user's input to safely land the UAV. The policy module is trained in simulation using a population of simulated users. Simulated users are sampled from a parametric model with four parameters, which model a pilot's tendency to conform to the assistant, proficiency, aggressiveness and speed. We conduct a user study (n = 28) where human participants were tasked with landing a physical UAV on one of several platforms under challenging viewing conditions. The assistant, trained with only simulated user data, improved the task success rate from 51.4% to 98.2% despite being unaware of the human participants' goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.
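As a concrete illustration of the shared-autonomy idea, the following Python sketch blends the pilot's command with the assistant's output; the blending weight and saturation limits are assumptions for illustration (the paper's assistant is a TD3-trained policy acting on a learned latent state, not a fixed blend).

```python
import numpy as np

def blend_control(u_pilot: np.ndarray,
                  u_policy: np.ndarray,
                  alpha: float = 0.5,
                  u_max: float = 1.0) -> np.ndarray:
    """Convex combination of pilot and assistant commands, then saturation.

    `alpha` and `u_max` are illustrative assumptions, not paper values.
    """
    u = (1.0 - alpha) * u_pilot + alpha * u_policy
    return np.clip(u, -u_max, u_max)

# e.g. a 4-axis command (roll, pitch, yaw rate, thrust):
u = blend_control(np.array([0.2, -0.1, 0.0, 0.6]),
                  np.array([0.1, -0.3, 0.0, 0.5]))
```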
{"title":"Reinforcement learning for shared autonomy drone landings","authors":"Kal Backman, Dana Kulić, Hoam Chung","doi":"10.1007/s10514-023-10143-3","DOIUrl":"10.1007/s10514-023-10143-3","url":null,"abstract":"<div><p>Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs), due to the complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface and additional disturbances from the ground effect. Therefore we propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe landing zones are limited. Our approach is comprised of two modules: a perception module that encodes information onto a compressed latent representation using two RGB-D cameras and a policy module that is trained with the reinforcement learning algorithm TD3 to discern the pilot’s intent and to provide control inputs that augment the user’s input to safely land the UAV. The policy module is trained in simulation using a population of simulated users. Simulated users are sampled from a parametric model with four parameters, which model a pilot’s tendency to conform to the assistant, proficiency, aggressiveness and speed. We conduct a user study (<span>(n=28)</span>) where human participants were tasked with landing a physical UAV on one of several platforms under challenging viewing conditions. The assistant, trained with only simulated user data, improved task success rate from 51.4 to 98.2% despite being unaware of the human participants’ goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.\u0000</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1419 - 1438"},"PeriodicalIF":3.5,"publicationDate":"2023-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-023-10143-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135510764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Why ORB-SLAM is missing commonly occurring loop closures?
Pub Date : 2023-10-20, DOI: 10.1007/s10514-023-10149-x, Autonomous Robots 47(8): 1519-1535
Saran Khaliq, Muhammad Latif Anjum, Wajahat Hussain, Muhammad Uzair Khattak, Momen Rasool
We analyse, for the first time, the popular loop closing module of the well-known and widely used open-source visual SLAM pipeline ORB-SLAM. Investigating failures in the loop closure module of visual SLAM is challenging since it consists of multiple building blocks. Our meticulous investigation has revealed a few interesting findings. Contrary to reported results, ORB-SLAM frequently misses a large fraction of loop closures on public (KITTI, TUM RGB-D) datasets. A common assumption is that, in such scenarios, the visual place recognition (vPR) block of the loop closure module is unable to find a suitable match due to extreme conditions (dynamic scenes, viewpoint/scale changes). We report that the native vPR of ORB-SLAM is not the sole reason for these failures. Although recent deep vPR alternatives achieve impressive matching performance, replacing the native vPR with these deep alternatives only partially improves the loop closure performance of visual SLAM. Our findings suggest that the problem lies with the subsequent relative pose estimation module between the matched pair. ORB-SLAM3 has improved the recall of the original loop closing module; however, even in ORB-SLAM3, the loop closing module remains the major reason behind loop closing failures. Surprisingly, using off-the-shelf ORB and SIFT based relative pose estimators (non-real-time) manages to close most of the loops missed by ORB-SLAM. This significant performance gap between the two available methods suggests that ORB-SLAM's pipeline can be further matured by focusing on the relative pose estimators to improve loop closure performance, rather than investing more resources in improving vPR. We also evaluate deep alternatives for relative pose estimation in the context of loop closures. Interestingly, the performance of deep relocalization methods (e.g. MapNet) is worse than that of classic methods even in loop closure scenarios. This finding further supports the recently diagnosed fundamental limitation of deep relocalization methods. Finally, we expose a bias in the well-known public KITTI dataset due to which these commonly occurring failures have eluded the community. We augment the KITTI dataset with detailed loop closing labels. To compensate for the bias in the public datasets, we provide a challenging loop closure dataset which contains challenging yet commonly occurring indoor navigation scenarios with loop closures. We hope our findings and the accompanying dataset will help the community in further improving the popular ORB-SLAM pipeline.
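To illustrate the kind of off-the-shelf relative pose estimation the paper finds so effective, here is a minimal OpenCV sketch that matches ORB features between a loop-candidate image pair and recovers the up-to-scale relative pose from the essential matrix; the intrinsic matrix `K` and parameter values are assumed placeholders, not the paper's configuration.

```python
import cv2
import numpy as np

def relative_pose(img1, img2, K):
    """Estimate the relative pose between a loop-candidate image pair."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    # Brute-force Hamming matching with cross-checking for reliability.
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    # RANSAC essential-matrix estimation, then pose recovery.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                   prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t  # rotation and unit-norm translation between the pair
```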
{"title":"Why ORB-SLAM is missing commonly occurring loop closures?","authors":"Saran Khaliq, Muhammad Latif Anjum, Wajahat Hussain, Muhammad Uzair Khattak, Momen Rasool","doi":"10.1007/s10514-023-10149-x","DOIUrl":"10.1007/s10514-023-10149-x","url":null,"abstract":"<div><p>We analyse, for the first time, the popular loop closing module of a well known and widely used open-source visual SLAM (ORB-SLAM) pipeline. Investigating failures in the loop closure module of visual SLAM is challenging since it consists of multiple building blocks. Our meticulous investigations have revealed a few interesting findings. Contrary to reported results, ORB-SLAM frequently misses large fraction of loop closures on public (KITTI, TUM RGB-D) datasets. One common assumption is, in such scenarios, the visual place recognition (vPR) block of the loop closure module is unable to find a suitable match due to extreme conditions (dynamic scene, viewpoint/scale changes). We report that native vPR of ORB-SLAM is not the sole reason for these failures. Although recent deep vPR alternatives achieve impressive matching performance, replacing native vPR with these deep alternatives will only partially improve loop closure performance of visual SLAM. Our findings suggest that the problem lies with the subsequent relative pose estimation module between the matching pair. ORB-SLAM3 has improved the recall of the original loop closing module. However, even in ORB-SLAM3, the loop closing module is the major reason behind loop closing failures. Surprisingly, using <i>off-the-shelf</i> ORB and SIFT based relative pose estimators (non real-time) manages to close most of the loops missed by ORB-SLAM. This significant performance gap between the two available methods suggests that ORB-SLAM’s pipeline can be further matured by focusing on the relative pose estimators, to improve loop closure performance, rather than investing more resources on improving vPR. We also evaluate deep alternatives for relative pose estimation in the context of loop closures. Interestingly, the performance of deep relocalization methods (e.g. MapNet) is worse than classic methods even in loop closures scenarios. This finding further supports the fundamental limitation of deep relocalization methods recently diagnosed. Finally, we expose bias in well-known public dataset (KITTI) due to which these commonly occurring failures have eluded the community. We augment the KITTI dataset with detailed loop closing labels. In order to compensate for the bias in the public datasets, we provide a challenging loop closure dataset which contains challenging yet commonly occurring indoor navigation scenarios with loop closures. We hope our findings and the accompanying dataset will help the community in further improving the popular ORB-SLAM’s pipeline.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1519 - 1535"},"PeriodicalIF":3.5,"publicationDate":"2023-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135569276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reinforcement learning with model-based feedforward inputs for robotic table tennis
Pub Date : 2023-10-17, DOI: 10.1007/s10514-023-10140-6, Autonomous Robots 47(8): 1387-1403
Hao Ma, Dieter Büchler, Bernhard Schölkopf, Michael Muehlebach
We rethink the traditional reinforcement learning approach, which is based on optimizing over feedback policies, and propose a new framework that optimizes over feedforward inputs instead. This not only mitigates the risk of destabilizing the system during training but also reduces the bulk of the learning to a supervised learning task. As a result, efficient and well-understood supervised learning techniques can be applied and are tuned using a validation data set. The labels are generated with a variant of iterative learning control, which also incorporates prior knowledge about the underlying dynamics. Our framework is applied to intercepting and returning ping-pong balls that are played to a four-degrees-of-freedom robotic arm in real-world experiments. The robot arm is driven by pneumatic artificial muscles, which makes the control and learning tasks challenging. We highlight the potential of our framework by comparing it to a reinforcement learning approach that optimizes over feedback policies. We find that our framework achieves a higher success rate for the returns (100% vs. 96% on 107 consecutive trials; see https://youtu.be/kR9jowEH7PY) while requiring only about one tenth of the samples during training. We also find that our approach is able to deal with a variety of different incoming trajectories.
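A minimal sketch of the label-generation step, assuming the simplest scalar-gain form of iterative learning control; the gain `L` and the `simulate` rollout are illustrative assumptions, and the paper's variant additionally exploits prior knowledge of the dynamics.

```python
import numpy as np

def ilc_update(u: np.ndarray, y: np.ndarray, y_ref: np.ndarray,
               L: float = 0.5) -> np.ndarray:
    """One ILC iteration: u_{k+1} = u_k + L * (y_ref - y_k).

    u: feedforward input trajectory; y: measured output of the last trial;
    y_ref: reference trajectory. L is an assumed scalar learning gain.
    """
    return u + L * (y_ref - y)

# Repeated over trials, the refined feedforward inputs (paired with the
# observed incoming-ball states) become labels for supervised learning:
# for k in range(num_trials):
#     y = simulate(u)              # hypothetical plant or model rollout
#     u = ilc_update(u, y, y_ref)
```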
{"title":"Reinforcement learning with model-based feedforward inputs for robotic table tennis","authors":"Hao Ma, Dieter Büchler, Bernhard Schölkopf, Michael Muehlebach","doi":"10.1007/s10514-023-10140-6","DOIUrl":"10.1007/s10514-023-10140-6","url":null,"abstract":"<div><p>We rethink the traditional reinforcement learning approach, which is based on optimizing over feedback policies, and propose a new framework that optimizes over feedforward inputs instead. This not only mitigates the risk of destabilizing the system during training but also reduces the bulk of the learning to a supervised learning task. As a result, efficient and well-understood supervised learning techniques can be applied and are tuned using a validation data set. The labels are generated with a variant of iterative learning control, which also includes prior knowledge about the underlying dynamics. Our framework is applied for intercepting and returning ping-pong balls that are played to a four-degrees-of-freedom robotic arm in real-world experiments. The robot arm is driven by pneumatic artificial muscles, which makes the control and learning tasks challenging. We highlight the potential of our framework by comparing it to a reinforcement learning approach that optimizes over feedback policies. We find that our framework achieves a higher success rate for the returns (<span>(100%)</span> vs. <span>(96%)</span>, on 107 consecutive trials, see https://youtu.be/kR9jowEH7PY) while requiring only about one tenth of the samples during training. We also find that our approach is able to deal with a variant of different incoming trajectories.\u0000</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1387 - 1403"},"PeriodicalIF":3.5,"publicationDate":"2023-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-023-10140-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135995053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RoLoMa: robust loco-manipulation for quadruped robots with arms
Pub Date : 2023-10-15, DOI: 10.1007/s10514-023-10146-0, Autonomous Robots 47(8): 1463-1481
Henrique Ferrolho, Vladimir Ivan, Wolfgang Merkt, Ioannis Havoutis, Sethu Vijayakumar
Deployment of robotic systems in the real world requires a certain level of robustness in order to deal with uncertainty factors, such as mismatches in the dynamics model, noise in sensor readings, and communication delays. Some approaches tackle these issues reactively at the control stage. However, regardless of the controller, online motion execution can only be as robust as the system capabilities allow at any given state. This is why it is important to have good motion plans to begin with, where robustness is considered proactively. To this end, we propose a metric (derived from first principles) for representing robustness against external disturbances. We then use this metric within our trajectory optimization framework for solving complex loco-manipulation tasks. Through our experiments, we show that trajectories generated using our approach can resist a greater range of forces originating from any possible direction. By using our method, we can compute trajectories that solve tasks as effectively as before, with the added benefit of being able to counteract stronger disturbances in worst-case scenarios.
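As a hedged illustration of how robustness to disturbances can be scored (one plausible formulation, not necessarily the paper's first-principles metric), the following sketch computes, for a given disturbance direction, the largest disturbance magnitude the actuators can still reject under torque limits, posed as a linear program.

```python
import numpy as np
from scipy.optimize import linprog

def rejection_margin(G: np.ndarray, d: np.ndarray,
                     tau_min: np.ndarray, tau_max: np.ndarray) -> float:
    """Largest disturbance magnitude s along direction d that torques can
    reject, given a linearized torque-to-wrench map G (all assumed inputs)."""
    m, n = G.shape
    # Decision variables x = [tau (n), s (1)]; maximize s <=> minimize -s.
    c = np.zeros(n + 1)
    c[-1] = -1.0
    A_eq = np.hstack([G, -d.reshape(-1, 1)])  # enforce G tau = s d
    b_eq = np.zeros(m)
    bounds = [(lo, hi) for lo, hi in zip(tau_min, tau_max)] + [(0, None)]
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1] if res.success else 0.0

# Taking the minimum over many sampled directions gives a conservative
# worst-case margin that a trajectory optimizer could maximize:
# margin = min(rejection_margin(G, d, tau_min, tau_max) for d in directions)
```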
{"title":"RoLoMa: robust loco-manipulation for quadruped robots with arms","authors":"Henrique Ferrolho, Vladimir Ivan, Wolfgang Merkt, Ioannis Havoutis, Sethu Vijayakumar","doi":"10.1007/s10514-023-10146-0","DOIUrl":"10.1007/s10514-023-10146-0","url":null,"abstract":"<div><p>Deployment of robotic systems in the real world requires a certain level of robustness in order to deal with uncertainty factors, such as mismatches in the dynamics model, noise in sensor readings, and communication delays. Some approaches tackle these issues <i>reactively</i> at the control stage. However, regardless of the controller, online motion execution can only be as robust as the system capabilities allow at any given state. This is why it is important to have good motion plans to begin with, where robustness is considered <i>proactively</i>. To this end, we propose a metric (derived from first principles) for representing robustness against external disturbances. We then use this metric within our trajectory optimization framework for solving complex loco-manipulation tasks. Through our experiments, we show that trajectories generated using our approach can resist a greater range of forces originating from any possible direction. By using our method, we can compute trajectories that solve tasks as effectively as before, with the added benefit of being able to counteract stronger disturbances in worst-case scenarios.\u0000</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1463 - 1481"},"PeriodicalIF":3.5,"publicationDate":"2023-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-023-10146-0.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136185248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FuseBot: mechanical search of rigid and deformable objects via multi-modal perception
Pub Date : 2023-09-23, DOI: 10.1007/s10514-023-10137-1, Autonomous Robots 47(8): 1137-1154
Tara Boroushaki, Laura Dodds, Nazish Naeem, Fadel Adib
Mechanical search is a robotic problem where a robot needs to retrieve a target item that is partially or fully occluded from its camera. State-of-the-art approaches for mechanical search either require an expensive search process to find the target item, or they require the item to be tagged with a radio frequency identification (RFID) tag, making their approach beneficial only to tagged items in the environment. We present FuseBot, the first robotic system for RF-Visual mechanical search that enables efficient retrieval of both RF-tagged and untagged items in a pile. Rather than requiring all target items in a pile to be RF-tagged, FuseBot leverages the mere existence of an RF-tagged item in the pile to benefit both tagged and untagged items. Our design introduces two key innovations. The first is RF-Visual Mapping, a technique that identifies and locates RF-tagged items in a pile and uses this information to construct an RF-Visual occupancy distribution map. The second is RF-Visual Extraction, a policy formulated as an optimization problem that minimizes the number of actions required to extract the target object by accounting for the probabilistic occupancy distribution, the expected grasp quality, and the expected information gain from future actions. We built a real-time end-to-end prototype of our system on a UR5e robotic arm with in-hand vision and RF perception modules. We conducted over 200 real-world experimental trials to evaluate FuseBot and compare its performance to a state-of-the-art vision-based system named X-Ray (Danielczuk et al., in: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2020). Our experimental results demonstrate that FuseBot outperforms X-Ray by more than 40% in efficiency, measured as the number of actions required for successful mechanical search. Furthermore, in comparison to X-Ray's success rate of 84%, FuseBot achieves a success rate of 95% in retrieving untagged items, demonstrating for the first time that the benefits of RF perception extend beyond tagged objects in the mechanical search problem.
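The following Python fragment sketches the flavor of the RF-Visual Extraction objective: candidate grasps are scored by target occupancy probability, expected grasp quality, and expected information gain. The weights and the fields of `action` are illustrative assumptions rather than the paper's exact optimization.

```python
import numpy as np

def score_action(action: dict, occupancy_map: np.ndarray,
                 w_occ: float = 1.0, w_grasp: float = 0.5,
                 w_info: float = 0.5) -> float:
    """Score one candidate extraction action (all weights are assumed)."""
    r, c = action["pixel"]                # grasp location in map coordinates
    p_target = occupancy_map[r, c]        # P(target under this grasp)
    return (w_occ * p_target
            + w_grasp * action["grasp_quality"]
            + w_info * action["expected_info_gain"])

# Greedy selection over a candidate set (the paper solves a fuller
# optimization over action sequences):
# best = max(candidate_actions, key=lambda a: score_action(a, occupancy_map))
```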
{"title":"FuseBot: mechanical search of rigid and deformable objects via multi-modal perception","authors":"Tara Boroushaki, Laura Dodds, Nazish Naeem, Fadel Adib","doi":"10.1007/s10514-023-10137-1","DOIUrl":"10.1007/s10514-023-10137-1","url":null,"abstract":"<div><p>Mechanical search is a robotic problem where a robot needs to retrieve a target item that is partially or fully-occluded from its camera. State-of-the-art approaches for mechanical search either require an expensive search process to find the target item, or they require the item to be tagged with a radio frequency identification tag (e.g., RFID), making their approach beneficial only to tagged items in the environment. We present FuseBot, the first robotic system for RF-Visual mechanical search that enables efficient retrieval of both RF-tagged and untagged items in a pile. Rather than requiring all target items in a pile to be RF-tagged, FuseBot leverages the mere existence of an RF-tagged item in the pile to benefit both tagged and untagged items. Our design introduces two key innovations. The first is <i>RF-Visual Mapping</i>, a technique that identifies and locates RF-tagged items in a pile and uses this information to construct an RF-Visual occupancy distribution map. The second is <i>RF-Visual Extraction</i>, a policy formulated as an optimization problem that minimizes the number of actions required to extract the target object by accounting for the probabilistic occupancy distribution, the expected grasp quality, and the expected information gain from future actions. We built a real-time end-to-end prototype of our system on a UR5e robotic arm with in-hand vision and RF perception modules. We conducted over 200 real-world experimental trials to evaluate FuseBot and compare its performance to a state-of-the-art vision-based system named X-Ray (Danielczuk et al., in: 2020 IEEE/RSJ international conference on intelligent robots and systems (IROS), IEEE, 2020). Our experimental results demonstrate that FuseBot outperforms X-Ray’s efficiency by more than 40% in terms of the number of actions required for successful mechanical search. Furthermore, in comparison to X-Ray’s success rate of 84%, FuseBot achieves a success rate of 95% in retrieving untagged items, demonstrating for the first time that the benefits of RF perception extend beyond tagged objects in the mechanical search problem.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1137 - 1154"},"PeriodicalIF":3.5,"publicationDate":"2023-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-023-10137-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135958951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
UVS: underwater visual SLAM—a robust monocular visual SLAM system for lifelong underwater operations
Pub Date : 2023-09-22, DOI: 10.1007/s10514-023-10138-0, Autonomous Robots 47(8): 1367-1385
Marco Leonardi, Annette Stahl, Edmund Førland Brekke, Martin Ludvigsen
In this paper, a visual simultaneous localization and mapping (visual SLAM) system called underwater visual SLAM (UVS) is presented, specifically tailored for camera-only navigation in natural underwater environments. The UVS system is particularly optimized towards precision and robustness, as well as lifelong operation. We build upon ORB-SLAM (oriented FAST and rotated BRIEF simultaneous localization and mapping, where FAST denotes features from accelerated segment test and BRIEF denotes binary robust independent elementary features) and improve its accuracy by performing an exact search in the descriptor space during triangulation, and its robustness by utilizing a unified initialization method and a motion model. In addition, we present a scale-agnostic station-keeping detection, which aims to optimize the map and poses during station-keeping, and a pruning strategy, which takes into account a point's age and its distance to the active keyframe. An exhaustive evaluation is presented to the reader, using a total of 38 in-air and underwater sequences.
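A minimal sketch of what an exact search in binary descriptor space can look like, assuming 256-bit ORB descriptors stored as packed uint8 arrays; this brute-force Hamming matcher is illustrative, not UVS's implementation.

```python
import numpy as np

def exact_hamming_match(query: np.ndarray, database: np.ndarray):
    """Return (index, distance) of the database descriptor nearest to query.

    query: shape (32,) uint8; database: shape (N, 32) uint8 (assumed layout).
    """
    # XOR, then count set bits per row to get exact Hamming distances.
    dists = np.unpackbits(np.bitwise_xor(database, query), axis=1).sum(axis=1)
    best = int(np.argmin(dists))
    return best, int(dists[best])

# db = np.random.randint(0, 256, (5000, 32), dtype=np.uint8)
# q = np.random.randint(0, 256, (32,), dtype=np.uint8)
# idx, d = exact_hamming_match(q, db)
```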
{"title":"UVS: underwater visual SLAM—a robust monocular visual SLAM system for lifelong underwater operations","authors":"Marco Leonardi, Annette Stahl, Edmund Førland Brekke, Martin Ludvigsen","doi":"10.1007/s10514-023-10138-0","DOIUrl":"10.1007/s10514-023-10138-0","url":null,"abstract":"<div><p>In this paper, a visual simultaneous localization and mapping (VSLAM/visual SLAM) system called underwater visual SLAM (UVS) system is presented, specifically tailored for camera-only navigation in natural underwater environments. The UVS system is particularly optimized towards precision and robustness, as well as lifelong operations. We build upon Oriented features from accelerated segment test and Rotated Binary robust independent elementary features simultaneous localization and mapping (ORB-SLAM) and improve the accuracy by performing an exact search in the descriptor space during triangulation and the robustness by utilizing a unified initialization method and a motion model. In addition, we present a scale-agnostic station-keeping detection, which aims to optimize the map and poses during station-keeping, and a pruning strategy, which takes into account the point’s age and distance to the active keyframe. An exhaustive evaluation is presented to the reader, using a total of 38 in-air and underwater sequences.\u0000</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1367 - 1385"},"PeriodicalIF":3.5,"publicationDate":"2023-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-023-10138-0.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136015937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Formation control for autonomous fixed-wing air vehicles with strict speed constraints
Pub Date : 2023-09-13, DOI: 10.1007/s10514-023-10126-4, Autonomous Robots 47(8): 1299-1323
Christopher Heintz, Sean C. C. Bailey, Jesse B. Hoagg
We present a formation-control algorithm for autonomous fixed-wing air vehicles. The desired inter-vehicle positions are time-varying, and we assume that at least one vehicle has access to a measurement of its position relative to the leader, which can be a physical or virtual member of the formation. Each vehicle is modeled with extended unicycle dynamics that include orientation kinematics on SO(3), speed dynamics, and strict constraints on speed (i.e., ground speed). The analytic result shows that the vehicles converge exponentially to the desired relative positions with respect to each other and the leader. We also show that each vehicle's speed satisfies the speed constraints. The formation algorithm is demonstrated in software-in-the-loop (SITL) simulations and experiments with fixed-wing air vehicles. To implement the formation-control algorithm, each vehicle has middle-loop controllers that determine roll, pitch, and throttle commands from the outer-loop formation control. We present SITL simulations with four fixed-wing air vehicles that demonstrate formation control with different communication structures. Finally, we present formation-control experiments with up to three fixed-wing air vehicles.
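For intuition, here is a minimal planar sketch of formation tracking with a strict speed envelope: a follower regulates a time-varying offset from the leader with a proportional law and clamps its commanded ground speed. Gains and limits are illustrative assumptions; the paper's model further includes orientation kinematics on SO(3) and speed dynamics.

```python
import numpy as np

def follower_velocity(p: np.ndarray, p_leader: np.ndarray,
                      offset: np.ndarray, v_leader: np.ndarray,
                      k: float = 0.8, v_min: float = 12.0,
                      v_max: float = 25.0) -> np.ndarray:
    """Velocity command tracking p_leader + offset under speed constraints.

    k, v_min, v_max are assumed values, not the paper's parameters.
    """
    v_cmd = v_leader + k * (p_leader + offset - p)   # proportional tracking
    speed = np.linalg.norm(v_cmd)
    clamped = np.clip(speed, v_min, v_max)           # strict speed envelope
    return v_cmd / max(speed, 1e-9) * clamped        # rescale, keep heading
```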
{"title":"Formation control for autonomous fixed-wing air vehicles with strict speed constraints","authors":"Christopher Heintz, Sean C. C. Bailey, Jesse B. Hoagg","doi":"10.1007/s10514-023-10126-4","DOIUrl":"10.1007/s10514-023-10126-4","url":null,"abstract":"<div><p>We present a formation-control algorithm for autonomous fixed-wing air vehicles. The desired inter-vehicle positions are time-varying, and we assume that at least one vehicle has access to a measurement its position relative to the leader, which can be a physical or virtual member of the formation. Each vehicle is modeled with extended unicycle dynamics that include orientation kinematics on SO(3), speed dynamics, and strict constraints on speed (i.e., ground speed). The analytic result shows that the vehicles converge exponentially to the desired relative positions with each other and the leader. We also show that each vehicle’s speed satisfies the speed constraints. The formation algorithm is demonstrated in software-in-the-loop (SITL) simulations and experiments with fixed-wing air vehicles. To implement the formation-control algorithm, each vehicle has middle-loop controllers to determine roll, pitch, and throttle commands from the outer-loop formation control. We present SITL simulations with 4 fixed-wing air vehicles that demonstrate formation control with different communication structures. Finally, we present formation-control experiments with up to 3 fixed-wing air vehicles.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1299 - 1323"},"PeriodicalIF":3.5,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135742202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sim-to-real transfer of co-optimized soft robot crawlers
Pub Date : 2023-09-08, DOI: 10.1007/s10514-023-10130-8, Autonomous Robots 47(8): 1195-1211
Charles Schaff, Audrey Sedal, Shiyao Ni, Matthew R. Walter
This work provides a complete framework for the simulation, co-optimization, and sim-to-real transfer of the design and control of soft legged robots. Soft robots have “mechanical intelligence”: the ability to passively exhibit behaviors that would otherwise be difficult to program. Exploiting this capacity requires consideration of the coupling between design and control. Co-optimization provides a way to reason over this coupling. Yet, it is difficult to achieve simulations that are both sufficiently accurate to allow for sim-to-real transfer and fast enough for contemporary co-optimization algorithms. We describe a modularized model order reduction algorithm that improves simulation efficiency, while preserving the accuracy required to learn effective soft robot design and control. We propose a reinforcement learning-based co-optimization framework that identifies several soft crawling robots that outperform an expert baseline with zero-shot sim-to-real transfer. We study generalization of the framework to new terrains, and the efficacy of domain randomization as a means to improve sim-to-real transfer.
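A minimal sketch of the domain-randomization step, assuming a hypothetical simulator interface: physical parameters are resampled per episode so the co-optimized design and controller do not overfit a single simulator instance. Parameter names and ranges are illustrative, not the paper's configuration.

```python
import random

def sample_sim_params() -> dict:
    """Resample simulator parameters for one training episode.

    All names and ranges below are assumed for illustration.
    """
    return {
        "elastic_modulus_scale": random.uniform(0.8, 1.2),
        "ground_friction": random.uniform(0.5, 1.5),
        "actuation_delay_s": random.uniform(0.0, 0.05),
        "terrain_slope_deg": random.uniform(-5.0, 5.0),
    }

# for episode in range(num_episodes):
#     env.reset(**sample_sim_params())  # hypothetical simulator interface
#     ...
```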
{"title":"Sim-to-real transfer of co-optimized soft robot crawlers","authors":"Charles Schaff, Audrey Sedal, Shiyao Ni, Matthew R. Walter","doi":"10.1007/s10514-023-10130-8","DOIUrl":"10.1007/s10514-023-10130-8","url":null,"abstract":"<div><p>This work provides a complete framework for the simulation, co-optimization, and sim-to-real transfer of the design and control of soft legged robots. Soft robots have “mechanical intelligence”: the ability to passively exhibit behaviors that would otherwise be difficult to program. Exploiting this capacity requires consideration of the coupling between design and control. Co-optimization provides a way to reason over this coupling. Yet, it is difficult to achieve simulations that are both sufficiently accurate to allow for sim-to-real transfer and fast enough for contemporary co-optimization algorithms. We describe a modularized model order reduction algorithm that improves simulation efficiency, while preserving the accuracy required to learn effective soft robot design and control. We propose a reinforcement learning-based co-optimization framework that identifies several soft crawling robots that outperform an expert baseline with zero-shot sim-to-real transfer. We study generalization of the framework to new terrains, and the efficacy of domain randomization as a means to improve sim-to-real transfer.\u0000</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1195 - 1211"},"PeriodicalIF":3.5,"publicationDate":"2023-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46827720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-directional Interaction Force Control with an Aerial Manipulator Under External Disturbances
Pub Date : 2023-09-01, DOI: 10.1007/s10514-023-10128-2, Autonomous Robots 47(8): 1325-1343
Grzegorz Malczyk, Maximilian Brunner, Eugenio Cuniato, Marco Tognon, Roland Siegwart
To improve the accuracy and robustness of interactive aerial robots, knowledge of the forces acting on the platform is of utmost importance. The robot should distinguish interaction forces from external disturbances in order to comply with the former and reject the latter. This represents a challenge since disturbances might be of different natures (physical contact, aerodynamics, modeling errors) and be applied at different points of the robot. This work presents a new extended Kalman filter (EKF) based estimator for both external disturbances and interaction forces. The estimator fuses information coming from the system's dynamic model and its state with wrench measurements coming from a force-torque sensor. This allows for robust interaction control at the tool's tip even in the presence of external disturbance wrenches acting on the platform. We employ the filter estimates in a novel hybrid force/motion controller to perform force tracking not only along the tool direction, but from any platform orientation, without losing the stability of the pose controller. The proposed framework is extensively tested on an omnidirectional aerial manipulator (AM) performing push-and-slide operations and transitioning between different interaction surfaces while subject to external disturbances. The experiments are conducted with the AM equipped with two different tools: a rigid interaction stick and an actuated delta manipulator, showing the generality of the approach. Moreover, the estimation results are compared to a state-of-the-art momentum-based estimator, clearly showing the superiority of the EKF approach.
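To give a flavor of the estimator, here is a minimal linear Kalman-filter sketch in which the unknown disturbance force is appended to the state as a random walk and the F/T-sensed interaction force enters the prediction as a known input; mass, time step, and covariances are assumptions, and the paper's EKF operates on the full nonlinear aerial-manipulator dynamics.

```python
import numpy as np

m, dt = 2.5, 0.01                      # assumed mass [kg] and time step [s]
# State x = [velocity (3); disturbance force (3)]; gravity compensation
# is omitted for brevity in this sketch.
F = np.block([[np.eye(3), (dt / m) * np.eye(3)],
              [np.zeros((3, 3)), np.eye(3)]])     # disturbance: random walk
H = np.hstack([np.eye(3), np.zeros((3, 3))])      # we measure velocity only
Q = np.diag([1e-4] * 3 + [1e-2] * 3)              # assumed process noise
R = 1e-3 * np.eye(3)                              # assumed measurement noise

def kf_step(x, P, u_thrust, f_interaction, v_meas):
    """One predict/update step; x[3:] is the disturbance-force estimate."""
    # Predict: commanded thrust and the F/T-sensed interaction force are
    # known inputs; the unknown disturbance lives in the state.
    B = np.vstack([(dt / m) * np.eye(3), np.zeros((3, 3))])
    x = F @ x + B @ (u_thrust + f_interaction)
    P = F @ P @ F.T + Q
    # Update with the velocity measurement.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (v_meas - H @ x)
    P = (np.eye(6) - K @ H) @ P
    return x, P
```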
{"title":"Multi-directional Interaction Force Control with an Aerial Manipulator Under External Disturbances","authors":"Grzegorz Malczyk, Maximilian Brunner, Eugenio Cuniato, Marco Tognon, Roland Siegwart","doi":"10.1007/s10514-023-10128-2","DOIUrl":"10.1007/s10514-023-10128-2","url":null,"abstract":"<div><p>To improve accuracy and robustness of interactive aerial robots, the knowledge of the forces acting on the platform is of uttermost importance. The robot should distinguish interaction forces from external disturbances in order to be compliant with the firsts and reject the seconds. This represents a challenge since disturbances might be of different nature (physical contact, aerodynamic, modeling errors) and be applied to different points of the robot. This work presents a new <span>(hbox {extended Kalman filter (EKF)})</span> based estimator for both external disturbance and interaction forces. The estimator fuses information coming from the system’s dynamic model and it’s state with wrench measurements coming from a Force-Torque sensor. This allows for robust interaction control at the tool’s tip even in presence of external disturbance wrenches acting on the platform. We employ the filter estimates in a novel hybrid force/motion controller to perform force tracking not only along the tool direction, but from any platform’s orientation, without losing the stability of the pose controller. The proposed framework is extensively tested on an omnidirectional aerial manipulator (AM) performing push and slide operations and transitioning between different interaction surfaces, while subject to external disturbances. The experiments are done equipping the AM with two different tools: a rigid interaction stick and an actuated delta manipulator, showing the generality of the approach. Moreover, the estimation results are compared to a state-of-the-art momentum-based estimator, clearly showing the superiority of the EKF approach.\u0000</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 8","pages":"1325 - 1343"},"PeriodicalIF":3.5,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10514-023-10128-2.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43984917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}