Representation, learning, and planning algorithms for geometric task and motion planning
International Journal of Robotics Research, 41(1): 210–231. Pub Date: 2021-09-08. DOI: 10.1177/02783649211038280
Beomjoon Kim, Luke Shimanuki, L. Kaelbling, Tomas Lozano-Perez
We present a framework for learning to guide geometric task-and-motion planning (G-TAMP). G-TAMP is a subclass of task-and-motion planning in which the goal is to move multiple objects to target regions among movable obstacles. A standard graph search algorithm is not directly applicable, because G-TAMP problems involve hybrid search spaces and expensive action feasibility checks. To handle this, we introduce a novel planner that extends basic heuristic search with random sampling and a heuristic function that prioritizes feasibility checking on promising state–action pairs. The main drawback of such pure planners is that they lack the ability to learn from planning experience to improve their efficiency. We propose two learning algorithms to address this. The first is an algorithm for learning a rank function that guides the discrete task-level search, and the second is an algorithm for learning a sampler that guides the continuous motion-level search. We propose design principles for data-efficient algorithms that learn from planning experience, and for representations that generalize effectively. We evaluate our framework in challenging G-TAMP problems, and show that we can improve both planning and data efficiency.
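The hybrid-search idea can be made concrete with a minimal sketch of best-first search in which discrete actions are ordered by a rank function and continuous parameters are sampled, so that the expensive feasibility check is spent only on promising candidates. This is an illustration under stated assumptions, not the authors' planner; `is_goal`, `actions`, `rank`, `sample_params`, `feasible`, and `apply_action` are placeholder callables the caller would supply.

```python
import heapq
import itertools

def guided_search(start, is_goal, actions, rank, sample_params, feasible,
                  apply_action, max_expansions=1000, top_k=3, samples_per_action=5):
    """Best-first search over a hybrid (discrete action, continuous parameter) space.

    Discrete actions at each state are ordered by `rank` (lower = more promising),
    and the expensive `feasible` check (e.g., collision/IK) is only spent on the
    top-ranked candidates, each instantiated with sampled continuous parameters.
    """
    tie = itertools.count()                       # break heap ties without comparing states
    frontier = [(0.0, next(tie), start, [])]
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, _, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan
        ranked = sorted(actions(state), key=lambda a: rank(state, a))
        for action in ranked[:top_k]:
            for _ in range(samples_per_action):
                params = sample_params(state, action)
                if feasible(state, action, params):          # expensive check
                    child = apply_action(state, action, params)
                    heapq.heappush(frontier, (rank(state, action), next(tie),
                                              child, plan + [(action, params)]))
                    break                                    # one feasible sample suffices
    return None
```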
Joint search of optimal topology and trajectory for planar linkages
International Journal of Robotics Research, 42(1): 176–195. Pub Date: 2021-09-08. DOI: 10.1177/02783649211069156
Zherong Pan, Min Liu, Xifeng Gao, Dinesh Manocha
We present an algorithm to compute planar linkage topology and geometry, given a user-specified end-effector trajectory. Planar linkage structures convert rotational or prismatic motions of a single actuator into an arbitrarily complex periodic motion, which is an important component when building low-cost, modular robots, mechanical toys, and foldable structures in our daily lives (chairs, bikes, and shelves). The design of such structures requires trial and error even for experienced engineers. Our research provides semi-automatic methods for exploring novel designs given high-level specifications and constraints. We formulate this problem as a non-smooth numerical optimization with quadratic objective functions and non-convex quadratic constraints involving mixed-integer decision variables (MIQCQP). We propose and compare three approximate algorithms to solve this problem: mixed-integer conic programming (MICP), mixed-integer nonlinear programming (MINLP), and simulated annealing (SA). We evaluate these algorithms by searching for planar linkages involving 10–14 rigid links. Our results show that the best performance can be achieved by combining MICP and MINLP, leading to a hybrid algorithm capable of finding the planar linkages within a couple of hours on a desktop machine, which significantly outperforms the SA baseline in terms of optimality. We highlight the effectiveness of our optimized planar linkages by using them as legs of a walking robot.
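For context on the SA baseline mentioned above, the following is a generic simulated-annealing loop with geometric cooling; `neighbor` and `cost` are placeholder callables (e.g., toggling a discrete connection or jittering a joint position, then scoring end-effector trajectory error plus constraint violation), and the schedule is illustrative rather than the paper's configuration.

```python
import math
import random

def simulated_annealing(initial, neighbor, cost, steps=10000, t_start=1.0, t_end=1e-3):
    """Generic simulated-annealing loop with geometric cooling.

    `neighbor` perturbs a candidate design and `cost` scores it; worse moves are
    accepted with a probability that shrinks as the temperature decays.
    """
    x, fx = initial, cost(initial)
    best, f_best = x, fx
    for k in range(steps):
        t = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        y = neighbor(x)
        fy = cost(y)
        if fy < fx or random.random() < math.exp(-(fy - fx) / t):
            x, fx = y, fy
            if fx < f_best:
                best, f_best = x, fx
    return best, f_best
```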
Special Issue on the Thirteenth Workshop on the Algorithmic Foundations of Robotics (WAFR) 2018
International Journal of Robotics Research, 40(1): 1047–1048. Pub Date: 2021-08-13. DOI: 10.1177/02783649211038146
Marco Morales, Lydia Tapia, Gildardo Sánchez-Ante, S. Hutchinson
Frequency modulation of body waves to improve performance of sidewinding robots
International Journal of Robotics Research, 40(1): 1547–1562. Pub Date: 2021-08-12. DOI: 10.1177/02783649211037715
Baxi Chong, Tianyu Wang, Jennifer M. Rieser, Bo Lin, Abdul Kaba, Grigoriy Blekherman, H. Choset, D. Goldman
Sidewinding is a form of locomotion executed by certain snakes and has been reconstructed in limbless robots; the gait is beneficial because it is effective in diverse terrestrial environments. Sidewinding gaits are generated by coordination of horizontal and vertical traveling waves of body undulation: the horizontal wave largely sets the direction of sidewinding with respect to the body frame, while the vertical traveling wave largely determines the contact pattern between the body and the environment. When the locomotor’s center of mass leaves the supporting polygon formed by the contact pattern, undesirable locomotor behaviors (such as unwanted turning or unstable oscillation of the body) can occur. In this article, we develop an approach to generate desired translation and turning by modulating the vertical wave. These modulations alter the distribution of body–environment contact patches and can stabilize configurations that were previously statically unstable. The approach first identifies the spatial frequency of the vertical wave that statically stabilizes the locomotor for a given horizontal wave. Then, using geometric mechanics tools, we design the coordination between body waves that produces the desired translation or rotation. We demonstrate the effectiveness of our technique in numerical simulations and in experiments with a 16-joint limbless robot locomoting on flat hard ground. Our scheme broadens the range of movements and behaviors accessible to sidewinding locomotors at low speeds, which can lead to limbless systems capable of traversing diverse terrain stably and/or rapidly.
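One common way to parameterize the two-wave coordination (not necessarily the exact equations used in the article) assigns the horizontal traveling wave to even-indexed joints and the vertical traveling wave to odd-indexed joints, with a phase offset between them; the vertical spatial frequency xi_v is the kind of quantity the article modulates. All amplitudes, frequencies, and the joint count below are illustrative assumptions.

```python
import numpy as np

def sidewinding_joint_angles(t, n_joints=16, amp_h=0.5, amp_v=0.25,
                             xi_h=1.0, xi_v=2.0, omega=2.0 * np.pi * 0.5,
                             phase=np.pi / 2):
    """Joint angles at time t for a sidewinding gait built from two traveling waves.

    Even-indexed joints carry the horizontal wave (sets heading in the body
    frame); odd-indexed joints carry the vertical wave (sets the ground-contact
    pattern). xi_h and xi_v are spatial frequencies; `phase` offsets the waves.
    """
    i = np.arange(n_joints)
    horizontal = amp_h * np.sin(2.0 * np.pi * xi_h * i / n_joints - omega * t)
    vertical = amp_v * np.sin(2.0 * np.pi * xi_v * i / n_joints - omega * t + phase)
    return np.where(i % 2 == 0, horizontal, vertical)

# Joint commands for one control tick.
angles = sidewinding_joint_angles(t=0.0)
```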
Sequential robot imitation learning from observations
International Journal of Robotics Research, 40(1): 1306–1325. Pub Date: 2021-08-06. DOI: 10.1177/02783649211032721
A. Tanwani, Andy Yan, Jonathan Lee, S. Calinon, Ken Goldberg
This paper presents a framework to learn the sequential structure in demonstrations for robot imitation learning. We first present a family of task-parameterized hidden semi-Markov models that extracts invariant segments (also called sub-goals or options) from demonstrated trajectories, and optimally follows the sampled sequence of states from the model with a linear quadratic tracking controller. We then extend the concept to learning invariant segments from visual observations that are sequenced together for robot imitation. We present Motion2Vec, which learns a deep embedding space by minimizing a metric learning loss in a Siamese network: images from the same action segment are pulled together while being pushed away from randomly sampled images of other segments, and a time contrastive loss is used to preserve the temporal ordering of the images. The trained embeddings are segmented with a recurrent neural network, and subsequently used for decoding the end-effector pose of the robot. We first show its application to a pick-and-place task with the Baxter robot, learned from only four kinesthetic demonstrations while avoiding a moving obstacle, followed by suturing task imitation from publicly available suturing videos of the JIGSAWS dataset, achieving state-of-the-art 85.5% segmentation accuracy and 0.94 cm position error per observation on the test set.
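As a toy illustration of the two losses described above (not the paper's implementation), the snippet below evaluates a margin-based same-segment loss and a time-contrastive loss directly on embedding vectors; in practice both terms would be applied to Siamese-network outputs during training, and the margin value is an arbitrary assumption.

```python
import numpy as np

def motion2vec_style_losses(anchor, same_segment, other_segment,
                            near_frame, far_frame, margin=0.2):
    """Toy margin losses on embedding vectors (1-D numpy arrays).

    Segment loss: pull embeddings from the same action segment together and push
    embeddings from other segments at least `margin` further away.
    Time-contrastive loss: keep a temporally close frame nearer to the anchor
    than a temporally distant frame, preserving temporal ordering.
    """
    sq_dist = lambda a, b: float(np.sum((a - b) ** 2))
    segment_loss = max(0.0, sq_dist(anchor, same_segment)
                       - sq_dist(anchor, other_segment) + margin)
    time_loss = max(0.0, sq_dist(anchor, near_frame)
                    - sq_dist(anchor, far_frame) + margin)
    return segment_loss, time_loss
```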
Bayesian controller fusion: Leveraging control priors in deep reinforcement learning for robotics
International Journal of Robotics Research, 42(1): 123–146. Pub Date: 2021-07-21. DOI: 10.1177/02783649231167210
Krishan Rana, Vibhavari Dasagi, Jesse Haviland, Ben Talbot, Michael Milford, N. Sunderhauf
We present Bayesian Controller Fusion (BCF): a hybrid control strategy that combines the strengths of traditional hand-crafted controllers and model-free deep reinforcement learning (RL). BCF thrives in the robotics domain, where reliable but suboptimal control priors exist for many tasks, but RL from scratch remains unsafe and data-inefficient. By fusing uncertainty-aware distributional outputs from each system, BCF arbitrates control between them, exploiting their respective strengths. We study BCF on two real-world robotics tasks involving navigation in a vast and long-horizon environment, and a complex reaching task that involves manipulability maximisation. For both these domains, simple handcrafted controllers exist that can solve the task at hand in a risk-averse manner but do not necessarily exhibit the optimal solution given limitations in analytical modelling, controller miscalibration and task variation. As exploration is naturally guided by the prior in the early stages of training, BCF accelerates learning, while substantially improving beyond the performance of the control prior, as the policy gains more experience. More importantly, given the risk aversion of the control prior, BCF ensures safe exploration and deployment, where the control prior naturally dominates the action distribution in states unknown to the policy. We additionally show BCF’s applicability to the zero-shot sim-to-real setting and its ability to deal with out-of-distribution states in the real world. BCF is a promising approach towards combining the complementary strengths of deep RL and traditional robotic control, surpassing what either can achieve independently. The code and supplementary video material are made publicly available at https://krishanrana.github.io/bcf.
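One natural reading of fusing "uncertainty-aware distributional outputs" is a precision-weighted product of Gaussians over actions. The sketch below shows that fusion rule under the assumption that both the control prior and the RL policy emit independent Gaussians per action dimension; consult the paper for the exact formulation.

```python
import numpy as np

def fuse_gaussians(mu_prior, sigma_prior, mu_policy, sigma_policy):
    """Precision-weighted fusion of two Gaussian action distributions.

    The more certain source (smaller sigma) dominates the fused mean, so a
    confident hand-crafted prior takes over in states the policy has not yet
    explored, while a confident policy dominates once it has learned.
    """
    prec_prior = 1.0 / np.square(sigma_prior)
    prec_policy = 1.0 / np.square(sigma_policy)
    precision = prec_prior + prec_policy
    mu = (prec_prior * mu_prior + prec_policy * mu_policy) / precision
    sigma = np.sqrt(1.0 / precision)
    return mu, sigma

# Early in training: an uncertain policy barely shifts a confident prior.
mu, sigma = fuse_gaussians(mu_prior=0.2, sigma_prior=0.05,
                           mu_policy=-0.8, sigma_policy=1.0)
```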
Motion planning by learning the solution manifold in trajectory optimization
International Journal of Robotics Research, 41(1): 281–311. Pub Date: 2021-07-13. DOI: 10.1177/02783649211044405
Takayuki Osa
The objective function used in trajectory optimization is often non-convex and can have an infinite set of local optima. In such cases, there are diverse solutions to perform a given task. Although there are a few methods to find multiple solutions for motion planning, they are limited to generating a finite set of solutions. To address this issue, we present an optimization method that learns an infinite set of solutions in trajectory optimization. In our framework, diverse solutions are obtained by learning latent representations of solutions. Our approach can be interpreted as training a deep generative model of collision-free trajectories for motion planning. The experimental results indicate that the trained model represents an infinite set of homotopic solutions for motion planning problems.
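The latent-representation idea can be illustrated with a toy decoder that maps a latent vector to a trajectory, so that sweeping the latent space traces out a continuous family of solutions. The linear decoder, random basis, and dimensions below are purely illustrative stand-ins for the trained deep generative model.

```python
import numpy as np

def decode_trajectory(z, mean_traj, basis):
    """Map a latent vector z to one trajectory on the solution manifold.

    mean_traj: (T, dof) nominal path; basis: (latent_dim, T, dof) learned
    directions. Each z gives one solution; sweeping z gives a continuous family.
    """
    return mean_traj + np.tensordot(z, basis, axes=1)

# Illustrative stand-ins for a trained model: 50 waypoints of a 7-DOF arm,
# a 3-dimensional latent space, and two distinct solutions from two z values.
rng = np.random.default_rng(0)
mean_traj = np.zeros((50, 7))
basis = 0.1 * rng.standard_normal((3, 50, 7))
traj_a = decode_trajectory(np.array([1.0, 0.0, 0.0]), mean_traj, basis)
traj_b = decode_trajectory(np.array([0.0, 1.0, 0.0]), mean_traj, basis)
```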
Physical interaction as communication: Learning robot objectives online from human corrections
International Journal of Robotics Research, 41(1): 20–44. Pub Date: 2021-07-06. DOI: 10.1177/02783649211050958
Dylan P. Losey, Andrea V. Bajcsy, M. O'Malley, A. Dragan
When a robot performs a task next to a human, physical interaction is inevitable: the human might push, pull, twist, or guide the robot. The state of the art treats these interactions as disturbances that the robot should reject or avoid. At best, these robots respond safely while the human interacts; but after the human lets go, these robots simply return to their original behavior. We recognize that physical human–robot interaction (pHRI) is often intentional: the human intervenes on purpose because the robot is not doing the task correctly. In this article, we argue that when pHRI is intentional it is also informative: the robot can leverage interactions to learn how it should complete the rest of its current task even after the person lets go. We formalize pHRI as a dynamical system, where the human has in mind an objective function they want the robot to optimize, but the robot does not get direct access to the parameters of this objective: they are internal to the human. Within our proposed framework, human interactions become observations about the true objective. We introduce approximations to learn from and respond to pHRI in real time. We recognize that not all human corrections are perfect: often users interact with the robot noisily, and so we improve the efficiency of robot learning from pHRI by reducing unintended learning. Finally, we conduct simulations and user studies on a robotic manipulator to compare our proposed approach with the state of the art. Our results indicate that learning from pHRI leads to better task performance and improved human satisfaction.
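A common online update for this setting, assuming the objective is linear in trajectory features, shifts the weight estimate along the feature difference between the human-corrected trajectory and the robot's original trajectory. The sketch below is that generic rule with a placeholder `features` callable and step size; it is not necessarily the approximations developed in the article.

```python
import numpy as np

def update_objective_weights(theta, features, robot_traj, corrected_traj, alpha=0.1):
    """One online update of objective weights from a physical correction.

    Assumes the objective is linear in trajectory features. The correction is
    treated as evidence that the true objective prefers the corrected
    trajectory, so the weights move along the feature difference.
    `features` is a placeholder mapping a trajectory to a feature vector.
    """
    phi_robot = np.asarray(features(robot_traj))
    phi_corrected = np.asarray(features(corrected_traj))
    return np.asarray(theta) + alpha * (phi_corrected - phi_robot)
```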
Task space adaptation via the learning of gait controllers of magnetic soft millirobots
International Journal of Robotics Research, 40(1): 1331–1351. Pub Date: 2021-06-16. DOI: 10.1177/02783649211021869
S. Demir, Utku Çulha, A. C. Karacakol, Abdon Pena‐Francesch, Sebastian Trimpe, M. Sitti
Untethered small-scale soft robots have promising applications in minimally invasive surgery, targeted drug delivery, and bioengineering, as they can directly and non-invasively access confined and hard-to-reach spaces in the human body. For such potential biomedical applications, the adaptivity of the robot control is essential to ensure the continuity of the operations, as task environment conditions show dynamic variations that can alter the robot’s motion and task performance. The applicability of the conventional modeling and control methods is further limited for soft robots at small scales owing to their kinematics with virtually infinite degrees of freedom, inherent stochastic variability during fabrication, and changing dynamics during real-world interactions. To address the controller adaptation challenge to dynamically changing task environments, we propose using a probabilistic learning approach for a millimeter-scale magnetic walking soft robot using Bayesian optimization (BO) and Gaussian processes (GPs). Our approach provides a data-efficient learning scheme by finding the gait controller parameters while optimizing the stride length of the walking soft millirobot using a small number of physical experiments. To demonstrate the controller adaptation, we test the walking gait of the robot in task environments with different surface adhesion and roughness, and medium viscosity, which aims to represent the possible conditions for future robotic tasks inside the human body. We further utilize the transfer of the learned GP parameters among different task spaces and robots and compare their efficacy on the improvement of data-efficient controller learning.
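A generic GP-based Bayesian optimization loop of the kind described above might look like the sketch below, which assumes an `evaluate_stride(params)` callable that runs one (physical or simulated) trial and returns the measured stride length. It uses scikit-learn's Gaussian process regressor with a Matérn kernel and an expected-improvement acquisition over random candidate points; the authors' setup may differ.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def bayes_opt_gait(evaluate_stride, bounds, n_init=5, n_iter=20, seed=0):
    """Maximize stride length over gait-controller parameters with GP-based BO.

    evaluate_stride(params) runs one trial and returns the measured stride
    length; bounds is a (dim, 2) array of [low, high] per parameter.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    dim = bounds.shape[0]
    X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_init, dim))
    y = np.array([evaluate_stride(x) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_iter):
        gp.fit(X, y)
        candidates = rng.uniform(bounds[:, 0], bounds[:, 1], size=(1000, dim))
        mu, std = gp.predict(candidates, return_std=True)
        best = y.max()
        z = (mu - best) / np.maximum(std, 1e-9)
        ei = (mu - best) * norm.cdf(z) + std * norm.pdf(z)   # expected improvement
        x_next = candidates[np.argmax(ei)]
        X = np.vstack([X, x_next])
        y = np.append(y, evaluate_stride(x_next))
    return X[np.argmax(y)], y.max()
```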
GKNet: Grasp keypoint network for grasp candidates detection
International Journal of Robotics Research, 41(1): 361–389. Pub Date: 2021-06-16. DOI: 10.1177/02783649211069569
Ruinian Xu, Fu-Jen Chu, P. Vela
Contemporary grasp detection approaches employ deep learning to achieve robustness to sensor and object model uncertainty. The two dominant approaches design either grasp-quality scoring or anchor-based grasp recognition networks. This paper presents a different approach to grasp detection by treating it as keypoint detection in image-space. The deep network detects each grasp candidate as a pair of keypoints, convertible to the grasp representation g = {x, y, w, θ}^T, rather than a triplet or quartet of corner points. Decreasing the detection difficulty by grouping keypoints into pairs boosts performance. To promote capturing dependencies between keypoints, a non-local module is incorporated into the network design. A final filtering strategy based on discrete and continuous orientation prediction removes false correspondences and further improves grasp detection performance. GKNet, the approach presented here, achieves a good balance between accuracy and speed on the Cornell and the abridged Jacquard datasets (96.9% and 98.39% at 41.67 and 23.26 fps). Follow-up experiments on a manipulator evaluate GKNet using four types of grasping experiments reflecting different nuisance sources: static grasping, dynamic grasping, grasping at varied camera angles, and bin picking. GKNet outperforms reference baselines in static and dynamic grasping experiments while showing robustness to varied camera viewpoints and moderate clutter. The results confirm the hypothesis that grasp keypoints are an effective output representation for deep grasp networks that provide robustness to expected nuisance factors.
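Decoding a detected keypoint pair into the grasp representation g = {x, y, w, θ}^T can be done with the midpoint, pair distance, and pair orientation in image space, as in the sketch below; this is an illustration of the representation rather than GKNet's exact decoding code.

```python
import math

def keypoints_to_grasp(left_tip, right_tip):
    """Convert a keypoint pair (two gripper-tip pixels) to (x, y, w, theta)."""
    (x1, y1), (x2, y2) = left_tip, right_tip
    x, y = (x1 + x2) / 2.0, (y1 + y2) / 2.0      # grasp center in image space
    w = math.hypot(x2 - x1, y2 - y1)             # gripper opening width (pixels)
    theta = math.atan2(y2 - y1, x2 - x1)         # grasp orientation (radians)
    return x, y, w, theta

# Example: a horizontal grasp 40 pixels wide centered at (120, 85).
print(keypoints_to_grasp((100.0, 85.0), (140.0, 85.0)))
```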