Krishan Rana, Vibhavari Dasagi, Jesse Haviland, Ben Talbot, Michael Milford, N. Sunderhauf
{"title":"Bayesian controller fusion: Leveraging control priors in deep reinforcement learning for robotics","authors":"Krishan Rana, Vibhavari Dasagi, Jesse Haviland, Ben Talbot, Michael Milford, N. Sunderhauf","doi":"10.1177/02783649231167210","DOIUrl":null,"url":null,"abstract":"We present Bayesian Controller Fusion (BCF): a hybrid control strategy that combines the strengths of traditional hand-crafted controllers and model-free deep reinforcement learning (RL). BCF thrives in the robotics domain, where reliable but suboptimal control priors exist for many tasks, but RL from scratch remains unsafe and data-inefficient. By fusing uncertainty-aware distributional outputs from each system, BCF arbitrates control between them, exploiting their respective strengths. We study BCF on two real-world robotics tasks involving navigation in a vast and long-horizon environment, and a complex reaching task that involves manipulability maximisation. For both these domains, simple handcrafted controllers exist that can solve the task at hand in a risk-averse manner but do not necessarily exhibit the optimal solution given limitations in analytical modelling, controller miscalibration and task variation. As exploration is naturally guided by the prior in the early stages of training, BCF accelerates learning, while substantially improving beyond the performance of the control prior, as the policy gains more experience. More importantly, given the risk-aversity of the control prior, BCF ensures safe exploration and deployment, where the control prior naturally dominates the action distribution in states unknown to the policy. We additionally show BCF’s applicability to the zero-shot sim-to-real setting and its ability to deal with out-of-distribution states in the real world. BCF is a promising approach towards combining the complementary strengths of deep RL and traditional robotic control, surpassing what either can achieve independently. The code and supplementary video material are made publicly available at https://krishanrana.github.io/bcf.","PeriodicalId":54942,"journal":{"name":"International Journal of Robotics Research","volume":"42 1","pages":"123 - 146"},"PeriodicalIF":7.5000,"publicationDate":"2021-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Robotics Research","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1177/02783649231167210","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ROBOTICS","Score":null,"Total":0}
引用次数: 11
Abstract
We present Bayesian Controller Fusion (BCF): a hybrid control strategy that combines the strengths of traditional hand-crafted controllers and model-free deep reinforcement learning (RL). BCF thrives in the robotics domain, where reliable but suboptimal control priors exist for many tasks, but RL from scratch remains unsafe and data-inefficient. By fusing uncertainty-aware distributional outputs from each system, BCF arbitrates control between them, exploiting their respective strengths. We study BCF on two real-world robotics tasks involving navigation in a vast and long-horizon environment, and a complex reaching task that involves manipulability maximisation. For both these domains, simple handcrafted controllers exist that can solve the task at hand in a risk-averse manner but do not necessarily exhibit the optimal solution given limitations in analytical modelling, controller miscalibration and task variation. As exploration is naturally guided by the prior in the early stages of training, BCF accelerates learning, while substantially improving beyond the performance of the control prior, as the policy gains more experience. More importantly, given the risk-aversity of the control prior, BCF ensures safe exploration and deployment, where the control prior naturally dominates the action distribution in states unknown to the policy. We additionally show BCF’s applicability to the zero-shot sim-to-real setting and its ability to deal with out-of-distribution states in the real world. BCF is a promising approach towards combining the complementary strengths of deep RL and traditional robotic control, surpassing what either can achieve independently. The code and supplementary video material are made publicly available at https://krishanrana.github.io/bcf.
期刊介绍:
The International Journal of Robotics Research (IJRR) has been a leading peer-reviewed publication in the field for over two decades. It holds the distinction of being the first scholarly journal dedicated to robotics research.
IJRR presents cutting-edge and thought-provoking original research papers, articles, and reviews that delve into groundbreaking trends, technical advancements, and theoretical developments in robotics. Renowned scholars and practitioners contribute to its content, offering their expertise and insights. This journal covers a wide range of topics, going beyond narrow technical advancements to encompass various aspects of robotics.
The primary aim of IJRR is to publish work that has lasting value for the scientific and technological advancement of the field. Only original, robust, and practical research that can serve as a foundation for further progress is considered for publication. The focus is on producing content that will remain valuable and relevant over time.
In summary, IJRR stands as a prestigious publication that drives innovation and knowledge in robotics research.