
International Journal of Robotics Research: Latest Articles

Survey of maps of dynamics for mobile robots
IF 9.2 | Zone 1 (Computer Science) | Q1 Mathematics | Pub Date: 2023-08-03 | DOI: 10.1177/02783649231190428
T. Kucner, Martin Magnusson, Sariah Mghames, Luigi Palmieri, Francesco Verdoja, Chittaranjan Srinivas Swaminathan, T. Krajník, E. Schaffernicht, N. Bellotto, Marc Hanheide, A. Lilienthal
Robotic mapping provides spatial information for autonomous agents. Depending on the tasks they seek to enable, the maps created range from simple 2D representations of the environment geometry to complex, multilayered semantic maps. This survey article is about maps of dynamics (MoDs), which store semantic information about typical motion patterns in a given environment. Some MoDs use trajectories as input, and some can be built from short, disconnected observations of motion. Robots can use MoDs, for example, for global motion planning, improved localization, or human motion prediction. Accounting for the increasing importance of maps of dynamics, we present a comprehensive survey that organizes the knowledge accumulated in the field and identifies promising directions for future work. Specifically, we introduce field-specific vocabulary, summarize existing work according to a novel taxonomy, and describe possible applications and open research problems. We conclude that the field is mature enough, and we expect that maps of dynamics will be increasingly used to improve robot performance in real-world use cases. At the same time, the field is still in a phase of rapid development where novel contributions could significantly impact this research area.
Citations: 0
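To make the map-of-dynamics idea in the abstract above more concrete, the following is a minimal sketch of one possible grid-based MoD built from short, disconnected velocity observations. It is not a method taken from the survey: the per-cell direction histogram, the cell size, the number of direction bins, and the names `build_mod` and `dominant_direction` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def build_mod(observations, cell_size=1.0, n_bins=8):
    """Build a toy map of dynamics: a per-cell histogram of motion directions.

    observations: iterable of (x, y, vx, vy) tuples (position and velocity).
    Returns a dict mapping (cell_x, cell_y) -> list of n_bins counts.
    """
    mod = defaultdict(lambda: [0] * n_bins)
    bin_width = 2 * math.pi / n_bins
    for x, y, vx, vy in observations:
        if vx == 0 and vy == 0:
            continue  # a stationary observation carries no direction
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        angle = math.atan2(vy, vx) % (2 * math.pi)  # map to [0, 2*pi)
        # Shift by half a bin so that bin 0 is centered on angle 0.
        shifted = (angle + bin_width / 2) % (2 * math.pi)
        b = min(int(shifted / bin_width), n_bins - 1)
        mod[cell][b] += 1
    return dict(mod)


def dominant_direction(mod, cell, n_bins=8):
    """Return the center angle (radians) of the most frequent bin, or None."""
    hist = mod.get(cell)
    if hist is None:
        return None  # no motion was ever observed in this cell
    best = max(range(n_bins), key=lambda b: hist[b])
    return best * 2 * math.pi / n_bins
```

A planner could, under these assumptions, query `dominant_direction` for each cell along a candidate path and penalize motion against the typical flow.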
BED-BPP: Benchmarking dataset for robotic bin packing problems
IF 9.2 | Zone 1 (Computer Science) | Q1 Mathematics | Pub Date: 2023-08-02 | DOI: 10.1177/02783649231193048
Florian Kagerer, Maximilian Beinhofer, Stefan Stricker, A. Nüchter
Many algorithms that were developed for solving three-dimensional bin packing problems use generic data for either experiments or evaluation. However, none of these datasets has become accepted throughout the community for benchmarking 3D bin packing algorithms. To close this gap, this paper presents the benchmarking dataset for robotic bin packing problems (BED-BPP), which is based on realistic data. We show the variety of the dataset through an n-gram analysis. In addition, we propose an evaluation function, which contains a stability check that uses rigid body simulation. We demonstrate the application of our dataset on four different approaches, which we integrated in our software environment.
Citations: 0
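The abstract above mentions an n-gram analysis used to show the variety of the dataset. As a hedged illustration of what such an analysis can look like, the sketch below counts contiguous n-grams over a sequence of item labels and reports the share of distinct n-grams. The label sequence, the function names `ngram_counts` and `distinct_ratio`, and the choice of a distinct-ratio measure are assumptions, not details from the paper.

```python
from collections import Counter


def ngram_counts(sequence, n=2):
    """Count contiguous n-grams in a sequence of item labels."""
    return Counter(
        tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)
    )


def distinct_ratio(sequence, n=2):
    """Fraction of n-grams that are distinct (1.0 means no n-gram repeats)."""
    counts = ngram_counts(sequence, n)
    total = sum(counts.values())
    return len(counts) / total if total else 0.0
```

For a hypothetical packing order such as `["A", "B", "A", "B", "C"]`, a low distinct ratio would indicate that the same item patterns recur often, while a ratio near 1.0 would indicate a highly varied order.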
Robust feedback motion planning via contraction theory
IF 9.2 | Zone 1 (Computer Science) | Q1 Mathematics | Pub Date: 2023-08-01 | DOI: 10.1177/02783649231186165
Sumeet Singh, Benoit Landry, Anirudha Majumdar, J. Slotine, M. Pavone
We present a framework for online generation of robust motion plans for robotic systems with nonlinear dynamics subject to bounded disturbances, control constraints, and online state constraints such as obstacles. In an offline phase, one computes the structure of a feedback controller that can be efficiently implemented online to track any feasible nominal trajectory. The offline phase leverages contraction theory, specifically, Control Contraction Metrics, and convex optimization to characterize a fixed-size “tube” that the state is guaranteed to remain within while tracking a nominal trajectory (representing the center of the tube). In the online phase, when the robot is faced with obstacles, a motion planner uses such a tube as a robustness margin for collision checking, yielding nominal trajectories that can be safely executed, that is, tracked without collisions under disturbances. In contrast to recent work on robust online planning using funnel libraries, our approach is not restricted to a fixed library of maneuvers computed offline and is thus particularly well-suited to applications such as UAV flight in densely cluttered environments where complex maneuvers may be required to reach a goal. We demonstrate our approach through numerical simulations of planar and 3D quadrotors, and hardware results on a quadrotor platform navigating a complex obstacle environment while subject to aerodynamic disturbances. The results demonstrate the ability of our approach to jointly balance motion safety and efficiency for agile robotic systems.
Citations: 20
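The online phase described in the abstract above uses a fixed-size tube as a robustness margin for collision checking. The sketch below shows only that margin idea in its simplest form, assuming a planar nominal path, circular obstacles, and a tube radius that is already known; computing the tube from Control Contraction Metrics is not shown. The function name `is_safe` and the waypoint-only check (segments between waypoints are not tested) are simplifying assumptions.

```python
import math


def is_safe(nominal_path, obstacles, tube_radius):
    """Check a nominal path against obstacles inflated by the tube radius.

    nominal_path: list of (x, y) waypoints (the assumed center of the tube).
    obstacles: list of (cx, cy, radius) circles.
    Returns True if the tube around every waypoint clears every obstacle.
    """
    for x, y in nominal_path:
        for cx, cy, r in obstacles:
            # The tube touches the obstacle if the center distance is within
            # the obstacle radius plus the tube radius.
            if math.hypot(x - cx, y - cy) <= r + tube_radius:
                return False
    return True
```

Inflating obstacles by the tube radius lets a standard planner reason only about the nominal trajectory while still accounting for bounded tracking error.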
Action-conditional implicit visual dynamics for deformable object manipulation
Zone 1 (Computer Science) | Q1 Mathematics | Pub Date: 2023-07-28 | DOI: 10.1177/02783649231191222
Bokui Shen, Zhenyu Jiang, Christopher Choy, Silvio Savarese, Leonidas J. Guibas, Anima Anandkumar, Yuke Zhu
Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, brings substantial challenges due to infinite shape variations, non-rigid motions, and partial observability. We introduce ACID, an action-conditional visual dynamics model for volumetric deformable objects based on structured implicit neural representations. ACID integrates two new techniques: implicit representations for action-conditional dynamics and geodesics-based contrastive learning. To represent deformable dynamics from partial RGB-D observations, we learn implicit representations of occupancy and flow-based forward dynamics. To accurately identify state change under large non-rigid deformations, we learn a correspondence embedding field through a novel geodesics-based contrastive loss. To evaluate our approach, we develop a simulation framework for manipulating complex deformable shapes in realistic scenes and a benchmark containing over 17,000 action trajectories with six types of plush toys and 78 variants. Our model achieves the best performance in geometry, correspondence, and dynamics predictions over existing approaches. The ACID dynamics models are successfully employed for goal-conditioned deformable manipulation tasks, resulting in a 30% increase in task success rate over the strongest baseline. Furthermore, we apply the simulation-trained ACID model directly to real-world objects and show success in manipulating them into target configurations. https://b0ku1.github.io/acid/
Citations: 0
GROUNDED: A localizing ground penetrating radar evaluation dataset for learning to localize in inclement weather
IF 9.2 | Zone 1 (Computer Science) | Q1 Mathematics | Pub Date: 2023-07-25 | DOI: 10.1177/02783649231183460
Teddy Ort, Igor Gilitschenski, Daniela Rus
Mapping and localization using surface features is prone to failure due to environment changes such as inclement weather. Recently, Localizing Ground Penetrating Radar (LGPR) has been proposed as an alternative means of localizing using underground features that are stable over time and less affected by surface conditions. However, due to the lack of commercially available LGPR sensors, the wider research community has been largely unable to replicate this work or build new and innovative solutions. We present GROUNDED, an open dataset of LGPR scans collected in a variety of environments and weather conditions. By labeling these data with ground truth localization from an RTK-GPS/Inertial Navigation System, and carefully calibrating and time-synchronizing the radar scans with ground truth positions, camera imagery, and lidar data, we enable researchers to build novel localization solutions that are resilient to changing surface conditions. We include 108 individual runs totaling 450 km of driving with LGPR, GPS, odometry, camera, and lidar measurements. We also present two new evaluation benchmarks for 1) localizing in weather and 2) multi-lane localization, to enable comparisons of future work supported by the dataset. Additionally, we present a first application of the new dataset in the form of LGPRNet: an inception-based CNN architecture for learning localization that is resilient to changing weather conditions. The dataset can be accessed at http://lgprdata.com .
Citations: 0
Stabilizing deep Q-learning with Q-graph-based bounds
IF 9.2 | Zone 1 (Computer Science) | Q1 Mathematics | Pub Date: 2023-07-25 | DOI: 10.1177/02783649231185165
Sabrina Hoppe, Markus Giftthaler, R. Krug, Marc Toussaint
State-of-the-art deep reinforcement learning has enabled autonomous agents to learn complex strategies from scratch on many problems, including continuous control tasks. Deep Q-networks (DQN) and deep deterministic policy gradients (DDPG) are two such algorithms, both based on Q-learning. They therefore both share function approximation, off-policy behavior, and bootstrapping, the constituents of the so-called deadly triad that is known for its convergence issues. We suggest taking a graph perspective on the data an agent has collected and show that the structure of this data graph is linked to the degree of divergence that can be expected. We further demonstrate that a subset of states and actions from the data graph can be selected such that the resulting finite graph can be interpreted as a simplified Markov decision process (MDP) for which the Q-values can be computed analytically. These Q-values are lower bounds for the Q-values in the original problem, and enforcing these bounds in temporal difference learning can help to prevent soft divergence. We show further effects on a simulated continuous control task, including improved sample efficiency, increased robustness toward hyperparameters, and a better ability to cope with limited replay memory. Finally, we demonstrate the benefits of our method on a large robotic benchmark with an industrial assembly task and approximately 60 h of real-world interaction.
Citations: 0
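The abstract above describes computing Q-values on a finite graph of recorded transitions and enforcing them as lower bounds in temporal difference learning. The following is a rough sketch of that idea under strong simplifying assumptions: transitions are deterministic, states with no recorded outgoing transition are treated as terminal with value 0 (which gives a valid lower bound only if that matches the task), and the Q-values are obtained by repeated sweeps rather than analytically. The names `graph_q_values` and `bounded_td_target` are hypothetical and not taken from the paper.

```python
def graph_q_values(transitions, gamma=0.9, sweeps=100):
    """Value iteration on a finite deterministic graph of recorded data.

    transitions: dict mapping (state, action) -> (reward, next_state).
    Returns a dict mapping (state, action) -> Q-value on the graph.
    """
    q = {sa: 0.0 for sa in transitions}
    for _ in range(sweeps):
        for (s, a), (r, s_next) in transitions.items():
            # Only actions recorded in the graph are available in s_next.
            next_vals = [q[(s2, a2)] for (s2, a2) in transitions if s2 == s_next]
            q[(s, a)] = r + gamma * max(next_vals, default=0.0)
    return q


def bounded_td_target(reward, gamma, next_q_estimate, lower_bound):
    """TD target clipped from below by the graph-derived lower bound."""
    return max(reward + gamma * next_q_estimate, lower_bound)
```

In a training loop, `next_q_estimate` would come from the learned Q-network, and `lower_bound` from `graph_q_values` for the same state-action pair, so that an underestimating network cannot pull the target below what the recorded data already shows to be achievable.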
Robotic drilling for the Chinese Chang’E 5 lunar sample-return mission
IF 9.2 | Zone 1 (Computer Science) | Q1 Mathematics | Pub Date: 2023-07-01 | DOI: 10.1177/02783649231187918
Zhang Tao, Yong Pang, Ting Zeng, Guoxing Wang, Shen Yin, Kun Xu, Guidong Mo, Xingwang Zhang, Lusi Wang, Shuai Yang, Zengzeng Zhao, Junjie Qin, Junshan Gong, Zhongxiang Zhao, Xuefeng Tong, Zhongwang Yin, Haiyuan Wang, Fan Zhao, Yanhong Zheng, Xiangjin Deng, Bin Wang, Jinchang Xu, Wei Wang, Shuangfei Yu, Xiaoming Lai, Xilun Ding
On December 2, 2020, a 2-m class robotic drill onboard the Chinese Chang’E 5 lunar lander successfully penetrated 1 m into the lunar regolith and collected 259.72 g of samples. This paper presents the design and development, terrestrial tests, and lunar sampling results of the robotic drill. First, the system design of the robotic drill, including its engineering objectives, drill configuration, drilling and coring methods, and rotational speed determination, was studied. Subsequently, a control strategy was proposed to address the geological uncertainty and complexity of the lunar surface. Terrestrial tests were conducted to assess the sampling performance of the robotic drill under both atmospheric and vacuum conditions. Finally, the results of drilling on the lunar surface were obtained, and the complex geological conditions encountered were analyzed. The success of the Chinese Chang’E 5 lunar sample-return mission demonstrates the feasibility of the proposed robotic drill. This study can serve as an important reference for future extraterrestrial robotic regolith-sampling missions.
Citations: 1
ViF-GTAD: A new automotive dataset with ground truth for ADAS/AD development, testing, and validation
IF 9.2 | Zone 1 (Computer Science) | Q1 Mathematics | Pub Date: 2023-07-01 | DOI: 10.1177/02783649231188146
Sarah Haas, Selim Solmaz, Jakob Reckenzaun, Simon Genser
A new dataset for automated driving, which is the subject matter of this paper, identifies and addresses a gap in existing similar perception datasets. While most state-of-the-art perception datasets primarily focus on the provision of various onboard sensor measurements along with the semantic information under various driving conditions, the provided information is often insufficient since the object list and position data provided include unknown and time-varying errors. The current paper and the associated dataset describe the first publicly available perception measurement data that include not only the onboard sensor information from the camera, Lidar, and radar with semantically classified objects but also the high-precision ground-truth position measurements enabled by the accurate RTK-assisted GPS localization systems available on both the ego vehicle and the dynamic target objects. This paper provides insight on the capturing of the data, explicitly explaining the metadata structure and the content, as well as the potential application examples where it has been, and can potentially be, applied and implemented in relation to automated driving and environmental perception systems development, testing, and validation.
Citations: 1
Composable energy policies for reactive motion generation and reinforcement learning
Tier 1 (Computer Science) Q1 Mathematics Pub Date: 2023-06-26 DOI: 10.1177/02783649231179499
Julen Urain, Anqi Li, Puze Liu, Carlo D’Eramo, Jan Peters
In this work, we introduce composable energy policies (CEP), a novel framework for multi-objective motion generation. We frame the problem of composing multiple policy components from a probabilistic view. We consider a set of stochastic policies represented in arbitrary task spaces, where each policy represents a distribution of the actions to solve a particular task. Then, we aim to find the action in the configuration space that optimally satisfies all the policy components. The presented framework allows the fusion of motion generators from different sources: optimal control, data-driven policies, motion planning, and handcrafted policies. Classically, the problem of multi-objective motion generation is solved by composing a set of deterministic policies rather than stochastic policies. However, there are common situations where different policy components have conflicting behaviors, leading to oscillations or the robot getting stuck in an undesirable state. While our approach is not directly able to solve the conflicting policies problem, we claim that modeling each policy as a stochastic policy allows more expressive representations for each component than the classical reactive motion generation approaches. In some tasks, such as reaching a target in a cluttered environment, we show experimentally that the additional expressivity of CEP allows us to model policies that reduce these conflicting behaviors. One field that benefits from these reactive motion generators is robot reinforcement learning. Integrating these policy architectures with reinforcement learning allows us to include a set of inductive biases in the learning problem. These inductive biases guide the reinforcement learning agent towards informative regions or improve collision safety while exploring. In our work, we show how to integrate our proposed reactive motion generator as a structured policy for reinforcement learning. By combining the reinforcement learning agent's exploration with the prior-based CEP, we can improve learning performance and explore more safely.
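The idea of composing stochastic policy components can be illustrated with a small sketch. It is not the authors' implementation: it assumes every component is an isotropic Gaussian defined directly in a 2D action space (the abstract allows arbitrary task spaces), and it finds the jointly most likely action by scoring random candidates. The names `log_gaussian`, `compose_action`, `go_to_goal`, and `avoid` are hypothetical.

```python
import numpy as np

def log_gaussian(actions, mean, std):
    """Unnormalized log-density of an isotropic Gaussian over actions."""
    return -0.5 * np.sum(((actions - mean) / std) ** 2, axis=-1)

def compose_action(components, candidates):
    """Pick the candidate that maximizes the sum of component log-densities.

    Summing log-densities corresponds to multiplying the densities,
    i.e. a product-of-experts style composition (an assumption here).
    """
    total = np.zeros(len(candidates))
    for log_density in components:
        total += log_density(candidates)
    return candidates[np.argmax(total)]

# Two hypothetical components with different preferred actions.
rng = np.random.default_rng(0)
candidates = rng.uniform(-1.0, 1.0, size=(5000, 2))
go_to_goal = lambda a: log_gaussian(a, np.array([0.8, 0.0]), 0.3)
avoid = lambda a: log_gaussian(a, np.array([0.0, 0.6]), 0.6)

action = compose_action([go_to_goal, avoid], candidates)
print(action)
```

Because both components are Gaussian in this toy setting, the composed optimum is the precision-weighted mean of the two preferred actions, so the narrower (more confident) component dominates the result.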
Citations: 19
Control-oriented meta-learning
Tier 1 (Computer Science) Q1 Mathematics Pub Date: 2023-06-07 DOI: 10.1177/02783649231165085
Spencer M. Richards, Navid Azizan, Jean-Jacques Slotine, Marco Pavone
Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With both fully actuated and underactuated nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.
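The condition "linearly parameterizable with known nonlinear features" can be made concrete with a textbook scalar example. This is a standard adaptive-control sketch, not the controller from the paper; the symbols $\phi$, $a$, $k$, and $\Gamma$ are assumptions introduced for illustration.

```latex
% Scalar system with unknown parameters a and known features \phi(x)
\dot{x} = u + \phi(x)^\top a

% Tracking error, control law, and adaptation law (k > 0, \Gamma \succ 0)
e = x - x_d, \qquad
u = \dot{x}_d - k e - \phi(x)^\top \hat{a}, \qquad
\dot{\hat{a}} = \Gamma\, \phi(x)\, e

% Closed-loop error dynamics with \tilde{a} = \hat{a} - a
\dot{e} = -k e - \phi(x)^\top \tilde{a}

% Lyapunov function and its derivative along trajectories
V = \tfrac{1}{2} e^2 + \tfrac{1}{2} \tilde{a}^\top \Gamma^{-1} \tilde{a},
\qquad
\dot{V} = -k e^2 \le 0
```

The argument relies on $\phi$ being known. When it is not, as for the aerodynamic disturbances mentioned in the abstract, the features themselves have to be learned, which is the role the abstract assigns to meta-learning.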
Citations: 10