Supervised Bayesian specification inference from demonstrations
Ankit J. Shah, Pritish Kamath, Shen Li, Patrick L. Craven, Kevin J. Landers, Kevin Oden, Julie Shah
Pub Date: 2023-10-04 | DOI: 10.1177/02783649231204659
When observing task demonstrations, human apprentices are able to identify whether a given task is executed correctly long before they gain expertise in actually performing that task. Prior research into learning from demonstrations (LfD) has failed to capture this notion of the acceptability of a task’s execution; meanwhile, temporal logics provide a flexible language for expressing task specifications. Inspired by this, we present Bayesian specification inference, a probabilistic model for inferring task specification as a temporal logic formula. We incorporate methods from probabilistic programming to define our priors, along with a domain-independent likelihood function to enable sampling-based inference. We demonstrate the efficacy of our model for inferring specifications, with over 90% similarity observed between the inferred specification and the ground truth, both within a synthetic domain and during a real-world table setting task.
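As a hedged illustration of the inference idea above (not the paper's actual model, priors, or likelihood function), the sketch below maintains a Bayesian posterior over a tiny set of candidate temporal-logic-style specifications, scoring each by how well its satisfaction matches acceptability labels on demonstrations. The candidate formulas, the noise parameter `eps`, and the state fields are all illustrative assumptions:

```python
import math

# Candidate specifications modeled as boolean predicates over a trace
# (a list of state dicts). "G" = always, "F" = eventually.
def always(pred):
    return lambda trace: all(pred(s) for s in trace)

def eventually(pred):
    return lambda trace: any(pred(s) for s in trace)

candidates = {
    "G(safe)": always(lambda s: s["safe"]),
    "F(goal)": eventually(lambda s: s["goal"]),
}

def posterior(demos, labels, prior=None, eps=0.1):
    """Weight each candidate spec by how well it explains labeled demos.

    Likelihood: a spec 'explains' a demo when its satisfaction matches
    the acceptability label; mismatches occur with probability eps
    (a simple noise model standing in for a real likelihood function).
    """
    prior = prior or {name: 1.0 / len(candidates) for name in candidates}
    log_post = {}
    for name, spec in candidates.items():
        ll = 0.0
        for trace, label in zip(demos, labels):
            match = (spec(trace) == label)
            ll += math.log(1 - eps if match else eps)
        log_post[name] = math.log(prior[name]) + ll
    # normalize in log space for numerical stability
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return {k: math.exp(v - m) / z for k, v in log_post.items()}

demos = [
    [{"safe": True, "goal": False}, {"safe": True, "goal": True}],   # acceptable
    [{"safe": False, "goal": True}, {"safe": True, "goal": True}],   # unacceptable
]
labels = [True, False]
p = posterior(demos, labels)
```

Here `G(safe)` explains both labels while `F(goal)` mismatches the second demo, so the posterior concentrates on the former.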
Stable nullspace adaptive parameter identification of 6 degree-of-freedom plant and actuator models for underactuated vehicles: Theory and experimental evaluation
Zachary J. Harris, Annie M. Mao, Tyler M. Paine, Louis L. Whitcomb
Pub Date: 2023-10-01 | DOI: 10.1177/02783649231191184
Model-based approaches to navigation, control, and fault detection that utilize precise nonlinear models of vehicle plant dynamics will enable more accurate control and navigation, assured autonomy, and more complex missions for such vehicles. This paper reports novel theoretical and experimental results addressing the problem of parameter estimation of plant and actuator models for underactuated underwater vehicles operating in 6 degrees of freedom (DOF) whose dynamics are modeled by finite-dimensional Newton-Euler equations. This paper reports the first theoretical approach, with experimental validation, for simultaneously identifying plant-model parameters (such as mass, added mass, hydrodynamic drag, and buoyancy) and control-actuator parameters (control-surface models and thruster models) in 6-DOF. Most previously reported studies on parameter identification assume that the control-actuator parameters are known a priori. Moreover, this paper reports the first proof of convergence of the parameter estimates to the true parameters for this class of vehicles under a persistence of excitation condition. The reported adaptive identification (AID) algorithm does not require instrumentation of 6-DOF vehicle acceleration, which is required by conventional approaches to parameter estimation such as least squares. Additionally, the reported AID algorithm is applicable under any open-loop or closed-loop control law. We report simulation and experimental results for identifying the plant-model and control-actuator parameters of an L3 OceanServer Iver3 autonomous underwater vehicle. We believe this general approach to AID could be extended to other classes of machines and to other marine, land, aerial, and space vehicles.
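To illustrate the flavor of adaptive identification under persistent excitation, here is a classical textbook-style gradient identifier for a first-order scalar plant; it is emphatically not the paper's nullspace 6-DOF algorithm, and the plant, gains, and input are illustrative assumptions. Like the AID approach described above, it uses only state and input measurements (no acceleration):

```python
import math

def simulate(a=-1.5, b=2.0, dt=1e-3, T=150.0, k=5.0, g=5.0):
    """Identify a, b in xdot = a*x + b*u with a series-parallel observer.

    Observer:  xhat_dot = a_hat*x + b_hat*u + k*(x - xhat)
    Adaptation (Lyapunov-based gradient laws), e = x - xhat:
               a_hat_dot = g*e*x,   b_hat_dot = g*e*u
    A two-frequency input keeps the regressor persistently exciting.
    """
    x, xhat, a_hat, b_hat = 0.0, 0.0, 0.0, 0.0
    t = 0.0
    while t < T:
        u = math.sin(t) + math.sin(2.3 * t)   # persistently exciting input
        e = x - xhat
        xdot = a * x + b * u
        xhat_dot = a_hat * x + b_hat * u + k * e
        a_hat += g * e * x * dt               # parameter update laws
        b_hat += g * e * u * dt
        x += xdot * dt                        # Euler integration
        xhat += xhat_dot * dt
        t += dt
    return a_hat, b_hat

a_hat, b_hat = simulate()
```

The Lyapunov function V = e²/2 + (ã² + b̃²)/(2g) has V̇ = −k·e² ≤ 0, and the two-frequency input provides the persistence of excitation needed for the parameter errors to converge.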
Kernel-GPA: A globally optimal solution to deformable SLAM in closed-form
Fang Bai, Kanzhi Wu, Adrien Bartoli
Pub Date: 2023-09-29 | DOI: 10.1177/02783649231195380
We study generalized Procrustes analysis (GPA) as a minimal formulation of the simultaneous localization and mapping (SLAM) problem. We propose KernelGPA, a novel global registration technique for solving SLAM in deformable environments. We introduce the concept of a deformable transformation, which encodes the entangled pose and deformation. We define deformable transformations using a kernel method, and show that both the deformable transformations and the environment map can be solved for globally in closed form, up to global scale ambiguities. We resolve the scale ambiguities with an optimization formulation that maximizes rigidity. We demonstrate KernelGPA using the Gaussian kernel and validate its superiority on various datasets. Code and data are available at https://bitbucket.org/FangBai/deformableprocrustes.
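The rigid special case underlying GPA is the classical orthogonal Procrustes (Kabsch) alignment of corresponding point sets, which KernelGPA generalizes to kernel-based deformable transformations. A minimal sketch of that rigid building block (the point sets below are illustrative):

```python
import numpy as np

def procrustes(A, B):
    """Rotation R and translation t minimizing ||R @ A + t - B||_F.

    A, B: 3xN arrays of corresponding points. Closed-form via SVD.
    """
    ca, cb = A.mean(axis=1, keepdims=True), B.mean(axis=1, keepdims=True)
    H = (B - cb) @ (A - ca).T                  # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # flip the last axis if needed so that det(R) = +1 (no reflection)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt
    t = cb - R @ ca
    return R, t

# Recover a known rigid transform from noiseless correspondences.
theta = 0.5
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([[0.2], [-0.1], [0.5]])
A = np.array([[0., 1., 0., 0., 2.],
              [0., 0., 1., 0., 1.],
              [0., 0., 0., 1., 3.]])
B = R_true @ A + t_true
R, t = procrustes(A, B)
```

KernelGPA replaces this rigid R, t with a deformable transformation expressed through a kernel, while retaining a closed-form global solution up to scale.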
Dynamic movement primitives in robotics: A tutorial survey
Matteo Saveriano, Fares J. Abu-Dakka, Aljaz Kramberger, Luka Peternel
Pub Date: 2023-09-23 | DOI: 10.1177/02783649231201196
Biological systems, including human beings, have the innate ability to perform complex tasks in a versatile and agile manner. Researchers in sensorimotor control have aimed to comprehend and formally define this innate characteristic. The idea that biological systems combine and adapt basic units of motion into complex tasks, supported by several experimental findings, ultimately led to the theory of motor primitives. In this respect, Dynamic Movement Primitives (DMPs) represent an elegant mathematical formulation of motor primitives as stable dynamical systems, well suited to generating motor commands for artificial systems such as robots. In recent decades, DMPs have inspired researchers in different robotic fields, including imitation and reinforcement learning, optimal control, physical interaction, and human–robot co-working, resulting in a considerable body of published work. The goal of this tutorial survey is twofold. On the one hand, we present the existing DMP formulations in rigorous mathematical terms and discuss the advantages and limitations of each approach, as well as practical implementation details. In the tutorial vein, we also collect existing implementations of the presented approaches and release several others. On the other hand, we provide a systematic and comprehensive review of the existing literature and categorize state-of-the-art work on DMPs. The paper concludes with a discussion of the limitations of DMPs and an outline of possible research directions.
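A minimal instance of one standard discrete DMP formulation covered by such surveys is the transformation system τż = α_z(β_z(g − y) − z) + f(x) with phase dynamics τẋ = −α_x x; with the learned forcing term f set to zero, the critically damped spring-damper simply converges to the goal. The sketch below uses conventional default gains (α_z = 25, β_z = α_z/4):

```python
def rollout(y0, g, tau=1.0, dt=0.001, alpha_z=25.0, alpha_x=1.0, T=1.0):
    """Integrate one DMP transformation system from y0 toward goal g.

    y: position, z: scaled velocity, x: phase variable decaying 1 -> 0.
    The forcing term f(x), normally a weighted sum of basis functions
    learned from a demonstration, is left at zero here.
    """
    beta_z = alpha_z / 4.0          # critical damping
    y, z, x = y0, 0.0, 1.0
    t = 0.0
    while t < T:
        f = 0.0                     # learned forcing term would go here
        zdot = (alpha_z * (beta_z * (g - y) - z) + f) / tau
        ydot = z / tau
        xdot = -alpha_x * x / tau
        y += ydot * dt              # Euler integration
        z += zdot * dt
        x += xdot * dt
        t += dt
    return y, x

y_end, x_end = rollout(y0=0.0, g=1.0)
```

Because the unforced system is a globally stable attractor at g, adding a phase-gated forcing term shapes the transient trajectory without compromising convergence, which is the core appeal of the DMP formulation.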
A mathematical characterization of minimally sufficient robot brains
Basak Sakcak, Kalle G Timperi, Vadim Weinstein, Steven M LaValle
Pub Date: 2023-09-19 | DOI: 10.1177/02783649231198898
This paper addresses the lower limits of encoding and processing the information acquired through interactions between an internal system (robot algorithms or software) and an external system (robot body and its environment) in terms of action and observation histories. Both are modeled as transition systems. We want to know the weakest internal system that is sufficient for achieving passive (filtering) and active (planning) tasks. We introduce the notion of an information transition system (ITS) for the internal system, which is a transition system over a space of information states that reflect a robot’s or other observer’s perspective based on limited sensing, memory, computation, and actuation. An ITS is viewed as a filter, and a policy or plan is viewed as a function that labels the states of this ITS. Regardless of whether internal systems are obtained by learning algorithms, planning algorithms, or human insight, we want to know the limits of feasibility for given robot hardware and tasks. We establish, in a general setting, that minimal information transition systems (ITSs) exist up to reasonable equivalence assumptions, and are unique under some general conditions. We then apply the theory to generate new insights into several problems, including optimal sensor fusion/filtering, solving basic planning tasks, and finding minimal representations for modeling a system given input-output relations.
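As a hypothetical concrete instance of an information transition system, the sketch below tracks a set of possible robot positions on a four-cell corridor (the information state) and updates it from actions and bump observations; the world, action set, and sensor are illustrative assumptions, not from the paper:

```python
CELLS = range(4)  # positions 0..3 in a 1D corridor

def step(pos, u):
    """External system: move by u in {-1, +1}; walls cause a 'bump'."""
    nxt = pos + u
    return (pos, "bump") if nxt not in CELLS else (nxt, "ok")

def its_transition(I, u, obs):
    """Internal system (a derived filter): map an information state
    (set of positions consistent with history), an action, and an
    observation to the next information state."""
    out = set()
    for p in I:
        q, o = step(p, u)
        if o == obs:
            out.add(q)
    return frozenset(out)

# Starting from total uncertainty, pushing right until a bump
# shrinks the information state to a single position.
I = frozenset(CELLS)
I = its_transition(I, +1, "ok")    # {1, 2, 3}
I = its_transition(I, +1, "ok")    # {2, 3}
I = its_transition(I, +1, "bump")  # {3}
```

The minimality question the paper studies is, roughly, how much of this information-state space (here, subsets of positions) can be collapsed while still supporting the filtering or planning task at hand.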
Tac2Pose: Tactile object pose estimation from the first touch
Maria Bauza, Antonia Bronars, Alberto Rodriguez
Pub Date: 2023-09-11 | DOI: 10.1177/02783649231196925
In this paper, we present Tac2Pose, an object-specific approach to tactile pose estimation from the first touch for known objects. Given the object geometry, we learn a tailored perception model in simulation that estimates a probability distribution over possible object poses given a tactile observation. To do so, we simulate the contact shapes that a dense set of object poses would produce on the sensor. Then, given a new contact shape obtained from the sensor, we match it against the pre-computed set using an object-specific embedding learned with contrastive learning. We obtain contact shapes from the sensor with an object-agnostic calibration step that maps RGB (red, green, blue) tactile observations to binary contact shapes. This mapping, which can be reused across object and sensor instances, is the only step trained with real sensor data. The result is a perception model that localizes objects from the first real tactile observation. Importantly, it produces pose distributions and can incorporate additional pose constraints coming from other perception systems, multiple contacts, or priors. We provide quantitative results for 20 objects. Tac2Pose provides high-accuracy pose estimates from distinctive tactile observations while regressing meaningful pose distributions to account for contact shapes that could result from different object poses. We extend and test Tac2Pose in multi-contact scenarios where two tactile sensors are simultaneously in contact with the object, as during a grasp with a parallel-jaw gripper. We further show that when the output pose distribution is filtered with a prior on the object pose, Tac2Pose is often able to improve significantly on the prior. This suggests synergistic use of Tac2Pose with additional sensing modalities (e.g., vision), even in cases where the tactile observation from a grasp is not sufficiently discriminative. Given a coarse estimate of an object’s pose, even ambiguous contacts can be used to determine the pose precisely. We also test Tac2Pose on object models reconstructed with a 3D scanner to evaluate robustness to uncertainty in the object model. We show that even in the presence of model uncertainty, Tac2Pose achieves fine accuracy comparable to using the manufacturer’s CAD (computer-aided design) model. Finally, we demonstrate the advantages of Tac2Pose compared with three baseline methods for tactile pose estimation: directly regressing the object pose with a neural network, matching an observed contact to a set of possible contacts using a standard classification neural network, and direct pixel comparison of an observed contact with a set of possible contacts. Website: mcube.mit.edu/research/tac2pose.html
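The matching step described above can be sketched as similarity search in an embedding space followed by a softmax over similarities to form a pose distribution. The embeddings below are hand-picked stand-ins for the paper's learned contrastive encoder, and the temperature is an assumption:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def pose_distribution(obs_emb, pose_embs, temperature=0.1):
    """Compare an observed-contact embedding against pre-computed
    embeddings of simulated contacts, and softmax the similarities
    into a probability distribution over the corresponding poses."""
    sims = {pose: cosine(obs_emb, e) for pose, e in pose_embs.items()}
    m = max(sims.values())
    exps = {p: math.exp((s - m) / temperature) for p, s in sims.items()}
    z = sum(exps.values())
    return {p: v / z for p, v in exps.items()}

# Hypothetical pre-computed embeddings for three candidate poses.
pose_embs = {"pose_A": [1.0, 0.0], "pose_B": [0.0, 1.0], "pose_C": [0.7, 0.7]}
dist = pose_distribution([0.9, 0.1], pose_embs)
```

Because the output is a distribution rather than a single pose, it can be multiplied with a prior or combined across multiple contacts, which is the property the abstract highlights for ambiguous contact shapes.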
Abstracting road traffic via topological braids: Applications to traffic flow analysis and distributed control
Christoforos Mavrogiannis, Jonathan DeCastro, S. Srinivasa
Pub Date: 2023-09-08 | DOI: 10.1177/02783649231188740
Despite the structure of road environments, imposed via geometry and rules, traffic flows exhibit complex multiagent dynamics. Reasoning about such dynamics is challenging due to the high dimensionality of possible behavior, the heterogeneity of agents, and the stochasticity of their decision-making. Modeling approaches learning associations in Euclidean spaces are often limited by their high sample complexity and the sparseness of available datasets. Our key insight is that the structure of traffic behavior could be effectively captured by lower-dimensional abstractions that emphasize critical interaction relationships. In this article, we abstract the space of behavior in traffic scenes into a discrete set of interaction modes, described in interpretable, symbolic form using topological braids. First, through a case study across real-world datasets, we show that braids can describe a wide range of complex behavior and uncover insights about the interactivity of vehicles. For instance, we find that high vehicle density does not always map to rich mixing patterns among them. Further, we show that our representation can effectively guide decision-making in traffic scenes. We describe a mechanism that probabilistically maps vehicles’ past behavior to modes of future interaction. We integrate this mechanism into a control algorithm that treats navigation as minimization of uncertainty over interaction modes, and investigate its performance on the task of traversing uncontrolled intersections in simulation. We show that our algorithm enables agents to coordinate significantly safer traversals for similar efficiency compared to baselines explicitly reasoning in the space of trajectories across a series of challenging scenarios.
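A much-simplified stand-in for the braid abstraction: given two agents' trajectories projected onto a line, emit a signed braid generator whenever their left-right order swaps, with the sign recording which agent passed "above" at the crossing. The projection axis, sign convention, and trajectories are illustrative assumptions:

```python
def braid_word(xs, ys):
    """Extract a toy braid word from two trajectories.

    xs, ys: per-timestep (x, y) tuples for agents A and B.
    Returns a list of +1/-1 generators, one per A-B exchange in x.
    """
    word = []
    for t in range(1, len(xs)):
        before = xs[t - 1][0] - ys[t - 1][0]
        after = xs[t][0] - ys[t][0]
        if before * after < 0:  # x-order swapped: a crossing occurred
            # sign: which agent is 'above' (larger y) at the swap
            word.append(+1 if xs[t][1] > ys[t][1] else -1)
    return word

# Two agents passing each other in opposite directions.
A = [(0, 1), (1, 1), (2, 1), (3, 1)]
B = [(3, 0), (2, 0), (1, 0), (0, 0)]
w = braid_word(A, B)
```

The resulting symbolic word is what makes the representation discrete and interpretable: entire families of geometrically distinct trajectories collapse to the same interaction mode.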
Iterative residual policy: For goal-conditioned dynamic manipulation of deformable objects
Cheng Chi, Benjamin Burchfiel, Eric Cousineau, Siyuan Feng, Shuran Song
Pub Date: 2023-09-07 | DOI: 10.1177/02783649231201201
This paper tackles the task of goal-conditioned dynamic manipulation of deformable objects. This task is highly challenging due to its complex dynamics (introduced by object deformation and high-speed action) and strict task requirements (defined by a precise goal specification). To address these challenges, we present Iterative Residual Policy (IRP), a general learning framework applicable to repeatable tasks with complex dynamics. IRP learns an implicit policy via delta dynamics—instead of modeling the entire dynamical system and inferring actions from that model, IRP learns delta dynamics that predict the effects of delta action on the previously observed trajectory. When combined with adaptive action sampling, the system can quickly optimize its actions online to reach a specified goal. We demonstrate the effectiveness of IRP on two tasks: whipping a rope to hit a target point and swinging a cloth to reach a target pose. Despite being trained only in simulation on a fixed robot setup, IRP is able to efficiently generalize to noisy real-world dynamics, new objects with unseen physical properties, and even different robot hardware embodiments, demonstrating its excellent generalization capability relative to alternative approaches.
Pub Date : 2023-09-05 DOI: 10.1177/02783649231197723
Alberto Jaenal, Francisco-Angel Moreno, J. Gonzalez-Jimenez
Representing scene appearance with a global image descriptor (BoW, NetVLAD, etc.) is a widely adopted choice for Visual Place Recognition (VPR). The main reasons are that appearance descriptors can be effectively endowed with radiometric and perspective invariances, and that their compactness lets them scale to large environments. However, addressing metric localization with such descriptors (a problem called Appearance-based Localization, or AbL) yields much poorer accuracy than techniques that exploit observations of 3D landmarks, which represent the standard for visual localization. In this paper, we propose ALLOM (Appearance-based Localization with Local Observation Models), which addresses AbL by leveraging the topological location of a robot within a map to achieve accurate metric estimates. This topology-assisted metric localization is implemented with a sequential Monte Carlo Bayesian filter that applies a specific observation model for each place in the environment, thus taking advantage of the local correlation between pose and appearance descriptor within each region. ALLOM also benefits from the topological structure of the map to detect when the robot loses tracking and to recover from it effectively by applying VPR for relocalization. Our proposal demonstrates superior metric localization compared to various state-of-the-art AbL methods under a wide range of situations.
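The core mechanism, a sequential Monte Carlo filter whose observation model switches with the topological place each particle occupies, can be sketched as follows. Everything concrete here is an illustrative assumption: a 1D pose, a scalar "appearance descriptor" that is piecewise-linear per place, and hand-picked noise levels; ALLOM learns its local observation models from real descriptors and operates on full robot poses.

```python
import numpy as np

PLACES = [  # (pose interval, slope, intercept) of each place's local model
    ((0.0, 5.0), 0.2, 1.0),
    ((5.0, 10.0), -0.3, 4.0),
]

def descriptor(pose):
    """Ground-truth appearance: a different linear map in each place."""
    for (lo, hi), a, b in PLACES:
        if lo <= pose < hi:
            return a * pose + b
    return 0.0

def pf_step(particles, weights, motion, obs, obs_std=0.05, rng=None):
    rng = np.random.default_rng(rng)
    # Predict: propagate particles through a noisy motion model.
    particles = particles + motion + rng.normal(0, 0.1, particles.size)
    # Update: weight each particle under the observation model of the
    # place it currently lies in (the "local" part of the method).
    pred = np.array([descriptor(p) for p in particles])
    weights = weights * np.exp(-0.5 * ((obs - pred) / obs_std) ** 2)
    if weights.sum() == 0:  # lost track: reset (cf. VPR relocalization)
        weights = np.full(particles.size, 1.0 / particles.size)
    else:
        weights /= weights.sum()
    # Resample to fight weight degeneracy.
    idx = rng.choice(particles.size, particles.size, p=weights)
    return particles[idx], np.full(particles.size, 1.0 / particles.size)

rng = np.random.default_rng(1)
particles = rng.uniform(0, 10, 500)
weights = np.full(500, 1.0 / 500)
true_pose = 2.0
for _ in range(10):
    true_pose += 0.3
    particles, weights = pf_step(particles, weights, 0.3,
                                 descriptor(true_pose), rng=rng)
estimate = particles.mean()
```

Note how the per-place models disambiguate appearance aliasing over time: a wrong-place particle cluster predicts descriptor changes with the wrong slope and is rapidly down-weighted by the sequential updates.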
"Sequential Monte Carlo localization in topometric appearance maps." Alberto Jaenal, Francisco-Angel Moreno, J. Gonzalez-Jimenez. International Journal of Robotics Research, published 2023-09-05. DOI: 10.1177/02783649231197723.
Pub Date : 2023-09-04 DOI: 10.1177/02783649231198900
Bohan Yang, Congying Sui, Fangxun Zhong, Yun-Hui Liu
Deformable object manipulation (DOM) with point clouds has great potential, as nonrigid 3D shapes can be measured without detecting and tracking image features. However, robotic shape control of deformable objects from point clouds is challenging for two reasons: the unknown point correspondences and noisy, partial observability of raw point clouds, and the difficulty of modeling the relationship between point clouds and robot motions. To tackle these challenges, this paper introduces a novel modal-graph framework for model-free shape servoing of deformable objects with raw point clouds. Unlike existing works that study the object’s geometric structure, we propose a modal graph to describe the low-frequency deformation structure of the DOM system, which is robust to measurement irregularities. The modal graph enables us to extract low-dimensional deformation features directly from raw point clouds without extra registration, refinement, or occlusion-removal steps. It also preserves the spatial structure of the DOM system, allowing feature changes to be inverted into robot motions. Moreover, as the framework is built without known physical or geometric object models, we design an adaptive robust controller that deforms the object toward the desired shape while handling modeling uncertainties, noise, and disturbances online. The system is proven to be input-to-state stable (ISS) using Lyapunov-based methods. Extensive experiments validate our method on linear, planar, tubular, and volumetric objects under different settings.
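The idea of compressing a raw point cloud into a few low-frequency deformation features can be illustrated with a graph-spectral sketch: build a neighborhood graph over the points, take the lowest-frequency eigenvectors of its Laplacian, and project the point coordinates onto them. This is only an analogy under stated assumptions: the plain k-NN Laplacian and the parameters (`k`, `n_modes`) are invented for illustration, and the paper’s modal graph is constructed differently.

```python
import numpy as np

def knn_laplacian(points, k=6):
    """Unnormalized graph Laplacian of a symmetrized k-NN graph."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    n = len(points)
    W = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(d[i])[1:k + 1]:  # skip self at index 0
            W[i, j] = W[j, i] = 1.0
    return np.diag(W.sum(axis=1)) - W

def modal_features(points, n_modes=4, k=6):
    """Project the (n x 3) coordinates onto the n_modes lowest-frequency
    Laplacian eigenvectors, giving an (n_modes x 3) descriptor that
    captures smooth, low-frequency deformation and averages out
    per-point measurement noise."""
    L = knn_laplacian(points, k)
    _, vecs = np.linalg.eigh(L)  # eigenvalues ascending: low modes first
    return vecs[:, :n_modes].T @ points

# A noisy line of points standing in for a scanned linear deformable object.
rng = np.random.default_rng(0)
cloud = np.stack([np.linspace(0, 1, 50),
                  np.zeros(50), np.zeros(50)], axis=1)
cloud += rng.normal(0, 0.01, cloud.shape)
feats = modal_features(cloud)
```

Because the projection keeps only smooth graph modes, high-frequency sensor noise on individual points barely perturbs the features, which is the kind of robustness to measurement irregularities the abstract describes.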
"Modal-graph 3D shape servoing of deformable objects with raw point clouds." Bohan Yang, Congying Sui, Fangxun Zhong, Yun-Hui Liu. International Journal of Robotics Research, published 2023-09-04. DOI: 10.1177/02783649231198900.