Pub Date: 2023-07-25 DOI: 10.1177/02783649231185165
Sabrina Hoppe, Markus Giftthaler, R. Krug, Marc Toussaint
State-of-the-art deep reinforcement learning has enabled autonomous agents to learn complex strategies from scratch on many problems, including continuous control tasks. Deep Q-networks (DQN) and deep deterministic policy gradient (DDPG) are two such algorithms, both based on Q-learning. They therefore share function approximation, off-policy behavior, and bootstrapping, the constituents of the so-called deadly triad, which is known for its convergence issues. We suggest taking a graph perspective on the data an agent has collected and show that the structure of this data graph is linked to the degree of divergence that can be expected. We further demonstrate that a subset of states and actions can be selected from the data graph such that the resulting finite graph can be interpreted as a simplified Markov decision process (MDP) for which the Q-values can be computed analytically. These Q-values are lower bounds for the Q-values in the original problem, and enforcing these bounds in temporal difference learning can help prevent soft divergence. We show further effects on a simulated continuous control task, including improved sample efficiency, increased robustness to hyperparameters, and a better ability to cope with limited replay memory. Finally, we demonstrate the benefits of our method on a large robotic benchmark with an industrial assembly task and approximately 60 h of real-world interaction.
{"title":"Stabilizing deep Q-learning with Q-graph-based bounds","authors":"Sabrina Hoppe, Markus Giftthaler, R. Krug, Marc Toussaint","doi":"10.1177/02783649231185165","DOIUrl":"https://doi.org/10.1177/02783649231185165","url":null,"abstract":"State-of-the art deep reinforcement learning has enabled autonomous agents to learn complex strategies from scratch on many problems including continuous control tasks. Deep Q-networks (DQN) and deep deterministic policy gradients (DDPGs) are two such algorithms which are both based on Q-learning. They therefore all share function approximation, off-policy behavior, and bootstrapping—the constituents of the so-called deadly triad that is known for its convergence issues. We suggest to take a graph perspective on the data an agent has collected and show that the structure of this data graph is linked to the degree of divergence that can be expected. We further demonstrate that a subset of states and actions from the data graph can be selected such that the resulting finite graph can be interpreted as a simplified Markov decision process (MDP) for which the Q-values can be computed analytically. These Q-values are lower bounds for the Q-values in the original problem, and enforcing these bounds in temporal difference learning can help to prevent soft divergence. We show further effects on a simulated continuous control task, including improved sample efficiency, increased robustness toward hyperparameters as well as a better ability to cope with limited replay memory. Finally, we demonstrate the benefits of our method on a large robotic benchmark with an industrial assembly task and approximately 60 h of real-world interaction.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"42 1","pages":"633 - 654"},"PeriodicalIF":9.2,"publicationDate":"2023-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46535776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-07-01 DOI: 10.1177/02783649231187918
Zhang Tao, Yong Pang, Ting Zeng, Guoxing Wang, Shen Yin, Kun Xu, Guidong Mo, Xingwang Zhang, Lusi Wang, Shuai Yang, Zengzeng Zhao, Junjie Qin, Junshan Gong, Zhongxiang Zhao, Xuefeng Tong, Zhongwang Yin, Haiyuan Wang, Fan Zhao, Yanhong Zheng, Xiangjin Deng, Bin Wang, Jinchang Xu, Wei Wang, Shuangfei Yu, Xiaoming Lai, Xilun Ding
On December 2, 2020, a 2-m class robotic drill onboard the Chinese Chang’E 5 lunar lander successfully penetrated 1 m into the lunar regolith and collected 259.72 g of samples. This paper presents the design and development, terrestrial tests, and lunar sampling results of the robotic drill. First, the system design of the robotic drill, including its engineering objectives, drill configuration, drilling and coring methods, and rotational speed determination, was studied. Subsequently, a control strategy was proposed to address the geological uncertainty and complexity of the lunar surface. Terrestrial tests were conducted to assess the sampling performance of the robotic drill under both atmospheric and vacuum conditions. Finally, the results of drilling on the lunar surface were obtained, and the complex geological conditions encountered were analyzed. The success of the Chinese Chang’E 5 lunar sample-return mission demonstrates the feasibility of the proposed robotic drill. This study can serve as an important reference for future extraterrestrial robotic regolith-sampling missions.
Title: Robotic drilling for the Chinese Chang’E 5 lunar sample-return mission (International Journal of Robotics Research, 42(1), 586–613)
Pub Date: 2023-07-01 DOI: 10.1177/02783649231188146
Sarah Haas, Selim Solmaz, Jakob Reckenzaun, Simon Genser
A new dataset for automated driving, which is the subject matter of this paper, identifies and addresses a gap in existing perception datasets. While most state-of-the-art perception datasets primarily focus on providing various onboard sensor measurements along with semantic information under various driving conditions, the provided information is often insufficient because the object lists and position data include unknown, time-varying errors. This paper and the associated dataset describe the first publicly available perception measurement data that include not only onboard sensor information from camera, Lidar, and radar with semantically classified objects but also high-precision ground-truth position measurements, enabled by accurate RTK-assisted GPS localization systems on both the ego vehicle and the dynamic target objects. The paper provides insight into how the data were captured, explains the metadata structure and content, and outlines application examples where the dataset has been, and can be, used in the development, testing, and validation of automated driving and environmental perception systems.
{"title":"ViF-GTAD: A new automotive dataset with ground truth for ADAS/AD development, testing, and validation","authors":"Sarah Haas, Selim Solmaz, Jakob Reckenzaun, Simon Genser","doi":"10.1177/02783649231188146","DOIUrl":"https://doi.org/10.1177/02783649231188146","url":null,"abstract":"A new dataset for automated driving, which is the subject matter of this paper, identifies and addresses a gap in existing similar perception datasets. While most state-of-the-art perception datasets primarily focus on the provision of various onboard sensor measurements along with the semantic information under various driving conditions, the provided information is often insufficient since the object list and position data provided include unknown and time-varying errors. The current paper and the associated dataset describe the first publicly available perception measurement data that include not only the onboard sensor information from the camera, Lidar, and radar with semantically classified objects but also the high-precision ground-truth position measurements enabled by the accurate RTK-assisted GPS localization systems available on both the ego vehicle and the dynamic target objects. This paper provides insight on the capturing of the data, explicitly explaining the metadata structure and the content, as well as the potential application examples where it has been, and can potentially be, applied and implemented in relation to automated driving and environmental perception systems development, testing, and validation.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"42 1","pages":"614 - 630"},"PeriodicalIF":9.2,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43905647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-06-27 DOI: 10.1177/02783649231184498
Weizhe Chen, Roni Khardon, Lantao Liu
Robotic Information Gathering (RIG) is a foundational research topic that addresses how a robot (or robot team) collects informative data to efficiently build an accurate model of an unknown target function under robot embodiment constraints. RIG has many applications, including but not limited to autonomous exploration and mapping, 3D reconstruction or inspection, search and rescue, and environmental monitoring. A RIG system relies on a probabilistic model’s prediction uncertainty to identify critical areas for informative data collection. Gaussian processes (GPs) with stationary kernels have been widely adopted for spatial modeling. However, real-world spatial data is typically non-stationary: different locations do not have the same degree of variability. As a result, the prediction uncertainty does not accurately reveal prediction error, limiting the success of RIG algorithms. We propose a family of non-stationary kernels, named the Attentive Kernel (AK), which is simple and robust and can extend any existing kernel to a non-stationary one. We evaluate the new kernel in elevation mapping tasks, where the AK provides better accuracy and uncertainty quantification than the commonly used stationary kernels and the leading non-stationary kernels. The improved uncertainty quantification guides the downstream informative planner to collect more valuable data around high-error areas, further increasing prediction accuracy. A field experiment demonstrates that the proposed method can guide an Autonomous Surface Vehicle (ASV) to prioritize data collection in locations with significant spatial variations, enabling the model to characterize salient environmental features.
{"title":"Adaptive Robotic Information Gathering via non-stationary Gaussian processes","authors":"Weizhe Chen, Roni Khardon, Lantao Liu","doi":"10.1177/02783649231184498","DOIUrl":"https://doi.org/10.1177/02783649231184498","url":null,"abstract":"Robotic Information Gathering (RIG) is a foundational research topic that answers how a robot (team) collects informative data to efficiently build an accurate model of an unknown target function under robot embodiment constraints. RIG has many applications, including but not limited to autonomous exploration and mapping, 3D reconstruction or inspection, search and rescue, and environmental monitoring. A RIG system relies on a probabilistic model’s prediction uncertainty to identify critical areas for informative data collection. Gaussian processes (GPs) with stationary kernels have been widely adopted for spatial modeling. However, real-world spatial data is typically non-stationary—different locations do not have the same degree of variability. As a result, the prediction uncertainty does not accurately reveal prediction error, limiting the success of RIG algorithms. We propose a family of non-stationary kernels named Attentive Kernel (AK), which is simple and robust and can extend any existing kernel to a non-stationary one. We evaluate the new kernel in elevation mapping tasks, where AK provides better accuracy and uncertainty quantification over the commonly used stationary kernels and the leading non-stationary kernels. The improved uncertainty quantification guides the downstream informative planner to collect more valuable data around the high-error area, further increasing prediction accuracy. A field experiment demonstrates that the proposed method can guide an Autonomous Surface Vehicle (ASV) to prioritize data collection in locations with significant spatial variations, enabling the model to characterize salient environmental features.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135454366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-06-26 DOI: 10.1177/02783649231179499
Julen Urain, Anqi Li, Puze Liu, Carlo D’Eramo, Jan Peters
In this work, we introduce composable energy policies (CEP), a novel framework for multi-objective motion generation. We frame the problem of composing multiple policy components from a probabilistic view. We consider a set of stochastic policies represented in arbitrary task spaces, where each policy represents a distribution over the actions that solve a particular task. Then, we aim to find the action in the configuration space that optimally satisfies all the policy components. The presented framework allows the fusion of motion generators from different sources: optimal control, data-driven policies, motion planning, and handcrafted policies. Classically, the problem of multi-objective motion generation is solved by composing a set of deterministic policies rather than stochastic ones. However, there are common situations where different policy components have conflicting behaviors, leading to oscillations or the robot getting stuck in an undesirable state. While our approach does not directly solve the conflicting-policies problem, we claim that modeling each policy as a stochastic policy allows more expressive representations for each component than classical reactive motion generation approaches. In some tasks, such as reaching a target in a cluttered environment, we show experimentally that CEP's additional expressivity allows us to model policies that reduce these conflicting behaviors. One field that benefits from these reactive motion generators is robot reinforcement learning. Integrating these policy architectures with reinforcement learning allows us to include a set of inductive biases in the learning problem. These inductive biases guide the reinforcement learning agent towards informative regions or improve collision safety while exploring. In our work, we show how to integrate our proposed reactive motion generator as a structured policy for reinforcement learning. By combining the reinforcement learning agent's exploration with the prior-based CEP, we can improve learning performance and explore more safely.
{"title":"Composable energy policies for reactive motion generation and reinforcement learning","authors":"Julen Urain, Anqi Li, Puze Liu, Carlo D’Eramo, Jan Peters","doi":"10.1177/02783649231179499","DOIUrl":"https://doi.org/10.1177/02783649231179499","url":null,"abstract":"In this work, we introduce composable energy policies (CEP), a novel framework for multi-objective motion generation. We frame the problem of composing multiple policy components from a probabilistic view. We consider a set of stochastic policies represented in arbitrary task spaces, where each policy represents a distribution of the actions to solve a particular task. Then, we aim to find the action in the configuration space that optimally satisfies all the policy components. The presented framework allows the fusion of motion generators from different sources: optimal control, data-driven policies, motion planning, and handcrafted policies. Classically, the problem of multi-objective motion generation is solved by the composition of a set of deterministic policies, rather than stochastic policies. However, there are common situations where different policy components have conflicting behaviors, leading to oscillations or the robot getting stuck in an undesirable state. While our approach is not directly able to solve the conflicting policies problem, we claim that modeling each policy as a stochastic policy allows more expressive representations for each component in contrast with the classical reactive motion generation approaches. In some tasks, such as reaching a target in a cluttered environment, we show experimentally that CEP additional expressivity allows us to model policies that reduce these conflicting behaviors. A field that benefits from these reactive motion generators is the one of robot reinforcement learning. Integrating these policy architectures with reinforcement learning allows us to include a set of inductive biases in the learning problem. These inductive biases guide the reinforcement learning agent towards informative regions or improve collision safety while exploring. In our work, we show how to integrate our proposed reactive motion generator as a structured policy for reinforcement learning. Combining the reinforcement learning agent exploration with the prior-based CEP, we can improve the learning performance and explore safer.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135607974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-06-08 DOI: 10.1177/02783649231178565
Kai Gao, Si Wei Feng, Baichuan Huang, Jingjin Yu
For rearranging objects on tabletops with overhand grasps, temporarily relocating objects to some buffer space may be necessary. This raises the natural question of how many simultaneous storage spaces, or “running buffers,” are required so that certain classes of tabletop rearrangement problems are feasible. In this work, we examine the problem in both labeled and unlabeled settings. On the structural side, we observe that finding the minimum number of running buffers (MRB) can be carried out on a dependency graph abstracted from a problem instance, and we show that computing the MRB is NP-hard. We then prove that under both labeled and unlabeled settings, even for uniform cylindrical objects, the number of required running buffers may grow unbounded as the number of objects to be rearranged increases. We further show that the bound for the unlabeled case is tight. On the algorithmic side, we develop effective exact algorithms for finding the MRB in both labeled and unlabeled tabletop rearrangement problems, scalable to over a hundred objects under very high object density. More importantly, our algorithms also compute a sequence witnessing the computed MRB that can be used for solving object rearrangement tasks. Employing these algorithms, empirical evaluations reveal that random labeled and unlabeled instances, which more closely mimic real-world setups, generally have fairly small MRBs. Using real robot experiments, we demonstrate that the running buffer abstraction leads to state-of-the-art solutions for the in-place rearrangement of many objects in a tight, bounded workspace.
{"title":"Minimizing running buffers for tabletop object rearrangement: Complexity, fast algorithms, and applications","authors":"Kai Gao, Si Wei Feng, Baichuan Huang, Jingjin Yu","doi":"10.1177/02783649231178565","DOIUrl":"https://doi.org/10.1177/02783649231178565","url":null,"abstract":"For rearranging objects on tabletops with overhand grasps, temporarily relocating objects to some buffer space may be necessary. This raises the natural question of how many simultaneous storage spaces, or “running buffers,” are required so that certain classes of tabletop rearrangement problems are feasible. In this work, we examine the problem for both labeled and unlabeled settings. On the structural side, we observe that finding the minimum number of running buffers (MRB) can be carried out on a dependency graph abstracted from a problem instance and show that computing MRB is NP-hard. We then prove that under both labeled and unlabeled settings, even for uniform cylindrical objects, the number of required running buffers may grow unbounded as the number of objects to be rearranged increases. We further show that the bound for the unlabeled case is tight. On the algorithmic side, we develop effective exact algorithms for finding MRB for both labeled and unlabeled tabletop rearrangement problems, scalable to over a hundred objects under very high object density. More importantly, our algorithms also compute a sequence witnessing the computed MRB that can be used for solving object rearrangement tasks. Employing these algorithms, empirical evaluations reveal that random labeled and unlabeled instances, which more closely mimic real-world setups generally have fairly small MRBs. Using real robot experiments, we demonstrate that the running buffer abstraction leads to state-of-the-art solutions for the in-place rearrangement of many objects in a tight, bounded workspace.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135268539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-06-07 DOI: 10.1177/02783649231165085
Spencer M. Richards, Navid Azizan, Jean-Jacques Slotine, Marco Pavone
Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With both fully actuated and underactuated nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.
{"title":"Control-oriented meta-learning","authors":"Spencer M. Richards, Navid Azizan, Jean-Jacques Slotine, Marco Pavone","doi":"10.1177/02783649231165085","DOIUrl":"https://doi.org/10.1177/02783649231165085","url":null,"abstract":"Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With both fully actuated and underactuated nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135403611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-06-01 DOI: 10.1177/02783649231180366
Jacob Hernandez Sanchez, Walid Amanhoud, A. Billard, M. Bouri
Robotic surgery is a promising direction for improving the daily work of surgeons and assistants relative to conventional surgery. In this work, we propose solo laparoscopic surgery, in which two robotic arms, controlled via haptic foot interfaces, assist the task of the hands. Such a system opens the door to simultaneous control of four laparoscopic tools by the same user. Each hand controls a manipulative tool, while one foot controls an endoscope/camera and the other controls an actuated gripper. In this scenario, the surgeon and robots need to work collaboratively within a concurrent workspace while meeting the precision demands of surgery. To this end, we propose a control framework for the robotic arms that handles all the task- and safety-related constraints. Furthermore, to ease control through the feet, two assistance modalities are proposed: adaptive visual tracking of the laparoscopic instruments with the camera, and grasping assistance for the gripper. A user study was conducted with twelve subjects to highlight the ease of use of the system and to evaluate the relevance of the proposed shared-control strategies. The results confirm that, without extensive training, users can perform four-arm surgical-like tasks that involve visual-tracking and manipulation goals for the feet as well as coordination with both hands. Moreover, our study characterizes and motivates the use of robotic assistance for reducing task load, improving performance, increasing fluency, and eliciting higher coordination during four-arm laparoscopic tasks.
{"title":"Enabling four-arm laparoscopic surgery by controlling two robotic assistants via haptic foot interfaces","authors":"Jacob Hernandez Sanchez, Walid Amanhoud, A. Billard, M. Bouri","doi":"10.1177/02783649231180366","DOIUrl":"https://doi.org/10.1177/02783649231180366","url":null,"abstract":"Robotic surgery is a promising direction to improve surgeons and assistants’ daily life with respect to conventional surgery. In this work, we propose solo laparoscopic surgery in which two robotic arms, controlled via haptic foot interfaces, assist the task of the hands. Such a system opens the door for simultaneous control of four laparoscopic tools by the same user. Each hand controls a manipulative tool while a foot controls an endoscope/camera and another controls an actuated gripper. In this scenario, the surgeon and robots need to work collaboratively within a concurrent workspace, while meeting the precision demands of surgery. To this end, we propose a control framework for the robotic arms that deals with all the task- and safety-related constraints. Furthermore, to ease the control through the feet, two assistance modalities are proposed: adaptive visual tracking of the laparoscopic instruments with the camera and grasping assistance for the gripper. A user study is conducted on twelve subjects to highlight the ease of use of the system and to evaluate the relevance of the proposed shared control strategies. The results confirm the feasibility of four-arm surgical-like tasks without extensive training in tasks that involve visual-tracking and manipulation goals for the feet, as well as coordination with both hands. Moreover, our study characterizes and motivates the use of robotic assistance for reducing task load, improving performance, increasing fluency, and eliciting higher coordination during four-arm laparoscopic tasks.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"42 1","pages":"475 - 503"},"PeriodicalIF":9.2,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41513852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-05-13 DOI: 10.1177/02783649231179753
Fangxun Zhong, Yun-hui Liu
Agile maneuvers are essential for robot-enabled complex tasks such as surgical procedures. Prior explorations of surgical autonomy are limited to feasibility studies of completing a single task without systematically addressing generic manipulation safety across different tasks. We present an integrated planning and control framework for 6-DoF robotic instruments for pipeline automation of surgical tasks. We leverage the geometry of a robotic instrument and propose a nodal state space to represent the robot state in SE(3). Each elementary robot motion can be encoded by regulating the state parameters via a dynamical system. This theoretically ensures that every in-process trajectory is globally feasible and stably reaches an admissible target, and the controller is closed-form, requiring no 6-DoF inverse kinematics. Then, to plan the motion steps reliably, we propose an interactive (instant) goal state of the robot that transforms manipulation planning under desired path constraints into a goal-varying manipulation (GVM) problem. We detail how GVM can adaptively and smoothly plan the procedure (proceeding or rewinding the process as needed) based on on-the-fly situations in dynamic or disturbed environments. Finally, we extend the above policy to characterize complete pipelines of various surgical tasks. Simulations show that our framework can smoothly solve twisted maneuvers while avoiding collisions. Physical experiments using the da Vinci Research Kit validate the capability of automating individual tasks including tissue debridement, dissection, and wound suturing. The results confirm good task-level consistency and reliability compared to state-of-the-art automation algorithms.
{"title":"Integrated planning and control of robotic surgical instruments for task autonomy","authors":"Fangxun Zhong, Yun-hui Liu","doi":"10.1177/02783649231179753","DOIUrl":"https://doi.org/10.1177/02783649231179753","url":null,"abstract":"Agile maneuvers are essential for robot-enabled complex tasks such as surgical procedures. Prior explorations on surgery autonomy are limited to feasibility study of completing a single task without systematically addressing generic manipulation safety across different tasks. We present an integrated planning and control framework for 6-DoF robotic instruments for pipeline automation of surgical tasks. We leverage the geometry of a robotic instrument and propose the nodal state space to represent the robot state in SE(3) space. Each elementary robot motion could be encoded by regulation of the state parameters via a dynamical system. This theoretically ensures that every in-process trajectory is globally feasible and stably reached to an admissible target, and the controller is of closed-form without computing 6-DoF inverse kinematics. Then, to plan the motion steps reliably, we propose an interactive (instant) goal state of the robot that transforms manipulation planning through desired path constraints into a goal-varying manipulation (GVM) problem. We detail how GVM could adaptively and smoothly plan the procedure (could proceed or rewind the process as needed) based on on-the-fly situations under dynamic or disturbed environment. Finally, we extend the above policy to characterize complete pipelines of various surgical tasks. Simulations show that our framework could smoothly solve twisted maneuvers while avoiding collisions. Physical experiments using the da Vinci Research Kit validates the capability of automating individual tasks including tissue debridement, dissection, and wound suturing. The results confirm good task-level consistency and reliability compared to state-of-the-art automation algorithms.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"42 1","pages":"504 - 536"},"PeriodicalIF":9.2,"publicationDate":"2023-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42888141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-05-09 DOI: 10.1177/02783649231177322
Clémentin Boittiaux, C. Dune, Maxime Ferrera, A. Arnaubec, R. Marxer, M. Matabos, Loïc Van Audenhaege, Vincent Hugel
Visual localization plays an important role in the positioning and navigation of robotic systems within previously visited environments. When visits occur over long periods of time, changes in the environment related to seasons or day-night cycles present a major challenge. Underwater, the variability is instead due to factors such as water conditions or the growth of marine organisms. Yet it remains a major obstacle, and a much less studied one, partly due to the lack of data. This paper presents a new deep-sea dataset to benchmark underwater long-term visual localization. The dataset is composed of images from four visits to the same hydrothermal vent edifice over the course of 5 years. Camera poses and a common geometry of the scene were estimated using navigation data and Structure-from-Motion; these serve as a reference when evaluating visual localization techniques. An analysis of the data provides insights into the major changes observed throughout the years. Furthermore, several well-established visual localization methods are evaluated on the dataset, showing there is still room for improvement in underwater long-term visual localization. The data is made publicly available at seanoe.org/data/00810/92226/.
{"title":"Eiffel Tower: A deep-sea underwater dataset for long-term visual localization","authors":"Clémentin Boittiaux, C. Dune, Maxime Ferrera, A. Arnaubec, R. Marxer, M. Matabos, Loïc Van Audenhaege, Vincent Hugel","doi":"10.1177/02783649231177322","DOIUrl":"https://doi.org/10.1177/02783649231177322","url":null,"abstract":"Visual localization plays an important role in the positioning and navigation of robotics systems within previously visited environments. When visits occur over long periods of time, changes in the environment related to seasons or day-night cycles present a major challenge. Under water, the sources of variability are due to other factors such as water conditions or growth of marine organisms. Yet, it remains a major obstacle and a much less studied one, partly due to the lack of data. This paper presents a new deep-sea dataset to benchmark underwater long-term visual localization. The dataset is composed of images from four visits to the same hydrothermal vent edifice over the course of 5 years. Camera poses and a common geometry of the scene were estimated using navigation data and Structure-from-Motion. This serves as a reference when evaluating visual localization techniques. An analysis of the data provides insights about the major changes observed throughout the years. Furthermore, several well-established visual localization methods are evaluated on the dataset, showing there is still room for improvement in underwater long-term visual localization. The data is made publicly available at seanoe.org/data/00810/92226/.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"42 1","pages":"689 - 699"},"PeriodicalIF":9.2,"publicationDate":"2023-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46759361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}