
Conference on Robot Learning: Latest Publications

Task-Relevant Failure Detection for Trajectory Predictors in Autonomous Vehicles
Pub Date : 2022-07-25 DOI: 10.48550/arXiv.2207.12380
Alec Farid, Sushant Veer, B. Ivanovic, Karen Leung, M. Pavone
In modern autonomy stacks, prediction modules are paramount to planning motions in the presence of other mobile agents. However, failures in prediction modules can mislead the downstream planner into making unsafe decisions. Indeed, the high uncertainty inherent to the task of trajectory forecasting ensures that such mispredictions occur frequently. Motivated by the need to improve the safety of autonomous vehicles without compromising their performance, we develop a probabilistic run-time monitor that detects when a "harmful" prediction failure occurs, i.e., a task-relevant failure detector. We achieve this by propagating trajectory prediction errors to the planning cost to reason about their impact on the AV. Furthermore, our detector comes equipped with performance measures on the false-positive and the false-negative rate and allows for data-free calibration. In our experiments, we compared our detector with various others and found that our detector has the highest area under the receiver operating characteristic curve.
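As a rough illustration of the idea of propagating prediction errors to the planning cost, the toy sketch below flags a misprediction as "harmful" only when it changes a made-up planning cost by more than a threshold. The cost function, threshold, and trajectories are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch, assuming a toy proximity-based planning cost; names are hypothetical.
import numpy as np

def planning_cost(ego_plan: np.ndarray, agent_traj: np.ndarray) -> float:
    """Toy cost: penalize proximity between the ego plan and another agent."""
    dists = np.linalg.norm(ego_plan - agent_traj, axis=-1)
    return float(np.sum(1.0 / (dists + 1e-3)))

def is_harmful_misprediction(ego_plan, predicted_traj, actual_traj, tau=2.0):
    """A failure is 'task-relevant' when the cost under the predicted trajectory
    differs substantially from the cost under the realized trajectory."""
    gap = abs(planning_cost(ego_plan, actual_traj) -
              planning_cost(ego_plan, predicted_traj))
    return gap > tau

# Example: a straight ego plan and a prediction that misses a cut-in maneuver.
t = np.linspace(0, 1, 10)[:, None]
ego = np.hstack([t * 10.0, np.zeros_like(t)])
pred = np.hstack([t * 10.0, np.full_like(t, 5.0)])   # predicted: agent keeps its lane
actual = np.hstack([t * 10.0, 5.0 - 4.5 * t])        # actual: agent cuts in front
print(is_harmful_misprediction(ego, pred, actual))   # -> True for this toy cut-in
```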
Citations: 14
Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models
Pub Date : 2022-07-23 DOI: 10.48550/arXiv.2207.11514
Huy Ha, Shuran Song
We study open-world 3D scene understanding, a family of tasks that require agents to reason about their 3D environment with an open-set vocabulary and out-of-domain visual inputs - a critical skill for robots to operate in the unstructured 3D world. Towards this end, we propose Semantic Abstraction (SemAbs), a framework that equips 2D Vision-Language Models (VLMs) with new 3D spatial capabilities, while maintaining their zero-shot robustness. We achieve this abstraction using relevancy maps extracted from CLIP, and learn 3D spatial and geometric reasoning skills on top of those abstractions in a semantic-agnostic manner. We demonstrate the usefulness of SemAbs on two open-world 3D scene understanding tasks: 1) completing partially observed objects and 2) localizing hidden objects from language descriptions. Experiments show that SemAbs can generalize to novel vocabulary, materials/lighting, classes, and domains (i.e., real-world scans) from training on limited 3D synthetic data. Code and data are available at https://semantic-abstraction.cs.columbia.edu/
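One geometric step implied by the abstract, lifting a 2D relevancy map into 3D, can be sketched as unprojecting per-pixel relevancy into a voxel grid using a depth image and camera intrinsics. This is an assumption-laden toy, not the SemAbs codebase; all names and sizes are hypothetical.

```python
# A rough sketch of 2D-relevancy-to-3D lifting; negative coordinates are simply clipped.
import numpy as np

def unproject_relevancy(relevancy, depth, K, voxel_size=0.05, grid_dim=64):
    """relevancy, depth: (H, W) arrays; K: 3x3 camera intrinsics."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    rel = relevancy.reshape(-1)
    # Accumulate the maximum relevancy observed in each voxel.
    idx = np.clip((pts / voxel_size).astype(int), 0, grid_dim - 1)
    grid = np.zeros((grid_dim,) * 3)
    np.maximum.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), rel)
    return grid

K = np.array([[500.0, 0, 160], [0, 500.0, 120], [0, 0, 1]])
grid = unproject_relevancy(np.random.rand(240, 320), np.full((240, 320), 2.0), K)
print(grid.shape)   # (64, 64, 64)
```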
Citations: 43
Online Dynamics Learning for Predictive Control with an Application to Aerial Robots
Pub Date : 2022-07-19 DOI: 10.48550/arXiv.2207.09344
Tom Z. Jiahao, K. Y. Chee, M. A. Hsieh
In this work, we consider the task of improving the accuracy of dynamic models for model predictive control (MPC) in an online setting. Although prediction models can be learned and applied to model-based controllers, these models are often learned offline. In this offline setting, training data is first collected and a prediction model is learned through an elaborate training procedure. However, since the model is learned offline, it does not adapt to disturbances or model errors observed during deployment. To improve the adaptiveness of the model and the controller, we propose an online dynamics learning framework that continually improves the accuracy of the dynamic model during deployment. We adopt knowledge-based neural ordinary differential equations (KNODE) as the dynamic models, and use techniques inspired by transfer learning to continually improve the model accuracy. We demonstrate the efficacy of our framework with a quadrotor, and verify the framework in both simulations and physical experiments. Results show that our approach can account for disturbances that are possibly time-varying, while maintaining good trajectory tracking performance.
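A minimal sketch of the "known model plus learned residual" structure that KNODE-style dynamics imply, with a single online gradient step on newly observed transitions. The placeholder physics term, network sizes, and update rule are simplifications and assumptions, not the authors' implementation.

```python
# Hybrid dynamics sketch: first-principles term + neural residual, updated online.
import torch
import torch.nn as nn

class HybridDynamics(nn.Module):
    def __init__(self, state_dim=6, ctrl_dim=3):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Linear(state_dim + ctrl_dim, 64), nn.Tanh(),
            nn.Linear(64, state_dim))

    def known_model(self, x, u):
        # Placeholder first-principles term: double-integrator kinematics.
        pos, vel = x[..., :3], x[..., 3:]
        return torch.cat([vel, u], dim=-1)

    def forward(self, x, u):
        return self.known_model(x, u) + self.residual(torch.cat([x, u], dim=-1))

def online_update(model, opt, x, u, x_next, dt=0.02):
    """One gradient step on transitions observed during deployment."""
    pred_next = x + dt * model(x, u)
    loss = torch.mean((pred_next - x_next) ** 2)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

model = HybridDynamics()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, u = torch.randn(32, 6), torch.randn(32, 3)
x_next = x + 0.02 * torch.randn(32, 6)
print(online_update(model, opt, x, u, x_next))
```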
Citations: 9
QuaDUE-CCM: Interpretable Distributional Reinforcement Learning using Uncertain Contraction Metrics for Precise Quadrotor Trajectory Tracking
Pub Date : 2022-07-15 DOI: 10.48550/arXiv.2207.07789
Yanran Wang, James O’Keeffe, Qiuchen Qian, David E. Boyle
Accuracy and stability are common requirements for quadrotor trajectory tracking systems. Designing an accurate and stable tracking controller remains challenging, particularly in unknown and dynamic environments with complex aerodynamic disturbances. We propose a Quantile-approximation-based Distributional-reinforced Uncertainty Estimator (QuaDUE) to accurately identify the effects of aerodynamic disturbances, i.e., the uncertainties between the true and estimated Control Contraction Metrics (CCMs). Taking inspiration from contraction theory and integrating the QuaDUE for uncertainties, our novel CCM-based trajectory tracking framework tracks any feasible reference trajectory precisely whilst guaranteeing exponential convergence. More importantly, the convergence and training acceleration of the distributional RL are guaranteed and analyzed, respectively, from theoretical perspectives. We also demonstrate our system under unknown and diverse aerodynamic forces. Under large aerodynamic forces (>2 m/s^2), compared with the classic data-driven approach, our QuaDUE-CCM achieves at least a 56.6% improvement in tracking error. Compared with QuaDRED-MPC, a distributional RL-based approach, QuaDUE-CCM achieves at least a threefold improvement in contraction rate.
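One generic ingredient behind quantile-approximation-based distributional estimators is the quantile Huber loss widely used in distributional RL. The sketch below fits a set of quantile estimates to sampled targets; it is a standard formulation and not taken from the QuaDUE-CCM code.

```python
# Generic quantile Huber loss sketch (QR-DQN style), assuming fixed quantile midpoints.
import torch

def quantile_huber_loss(pred_quantiles, targets, kappa=1.0):
    """pred_quantiles: (B, N) quantile estimates; targets: (B, M) target samples."""
    B, N = pred_quantiles.shape
    taus = (torch.arange(N, dtype=torch.float32) + 0.5) / N        # quantile midpoints
    u = targets.unsqueeze(1) - pred_quantiles.unsqueeze(2)         # (B, N, M) residuals
    huber = torch.where(u.abs() <= kappa,
                        0.5 * u ** 2,
                        kappa * (u.abs() - 0.5 * kappa))
    loss = (taus.view(1, N, 1) - (u < 0).float()).abs() * huber / kappa
    return loss.mean()

pred = torch.zeros(2, 8, requires_grad=True)
targ = torch.randn(2, 16)
print(quantile_huber_loss(pred, targ))
```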
Citations: 2
i-Sim2Real: Reinforcement Learning of Robotic Policies in Tight Human-Robot Interaction Loops
Pub Date : 2022-07-14 DOI: 10.48550/arXiv.2207.06572
Saminda Abeyruwan, L. Graesser, David B. D'Ambrosio, Avi Singh, A. Shankar, A. Bewley, Deepali Jain, K. Choromanski, P. Sanketi
Sim-to-real transfer is a powerful paradigm for robotic reinforcement learning. The ability to train policies in simulation enables safe exploration and large-scale data collection quickly at low cost. However, prior works in sim-to-real transfer of robotic policies typically do not involve any human-robot interaction because accurately simulating human behavior is an open problem. In this work, our goal is to leverage the power of simulation to train robotic policies that are proficient at interacting with humans upon deployment. But there is a chicken and egg problem -- how to gather examples of a human interacting with a physical robot so as to model human behavior in simulation without already having a robot that is able to interact with a human? Our proposed method, Iterative-Sim-to-Real (i-S2R), attempts to address this. i-S2R bootstraps from a simple model of human behavior and alternates between training in simulation and deploying in the real world. In each iteration, both the human behavior model and the policy are refined. For all training we apply a new evolutionary search algorithm called Blackbox Gradient Sensing (BGS). We evaluate our method on a real world robotic table tennis setting, where the objective for the robot is to play cooperatively with a human player for as long as possible. Table tennis is a high-speed, dynamic task that requires the two players to react quickly to each other's moves, making for a challenging test bed for research on human-robot interaction. We present results on an industrial robotic arm that is able to cooperatively play table tennis with human players, achieving rallies of 22 successive hits on average and 150 at best. Further, for 80% of players, rally lengths are 70% to 175% longer compared to the sim-to-real plus fine-tuning (S2R+FT) baseline. For videos of our system in action, please see https://sites.google.com/view/is2r.
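The alternating structure of i-S2R can be summarized with a deliberately toy loop. The "human model", "policy", and "training" below are scalar stand-ins, and the real system's BGS search, simulator, table tennis task, and physical robot are not reproduced here.

```python
# Toy sketch of the iterate-between-simulation-and-reality loop; everything is a stand-in.
import random

def fit_human_model(interactions):
    # Stand-in: a single scalar summarizing observed human behavior.
    return sum(interactions) / len(interactions) if interactions else 0.0

def train_in_sim(policy, human_model):
    # Stand-in for simulation training (e.g., BGS): nudge policy toward the model.
    return policy + 0.5 * (human_model - policy)

def deploy_on_robot(policy):
    # Stand-in for real-world rallies: noisy observations of human behavior.
    return [policy + random.gauss(1.0, 0.1) for _ in range(20)]

def i_sim2real(iterations=5):
    policy, human_model = 0.0, 0.0          # bootstrap from a simple behavior model
    for _ in range(iterations):
        policy = train_in_sim(policy, human_model)    # 1) train in simulation
        interactions = deploy_on_robot(policy)        # 2) deploy in the real world
        human_model = fit_human_model(interactions)   # 3) refine the human model
    return policy

print(i_sim2real())
```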
Citations: 23
Inner Monologue: Embodied Reasoning through Planning with Language Models
Pub Date : 2022-07-12 DOI: 10.48550/arXiv.2207.05608
Wenlong Huang, F. Xia, Ted Xiao, Harris Chan, Jacky Liang, Peter R. Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, P. Sermanet, Noah Brown, Tomas Jackson, Linda Luu, S. Levine, Karol Hausman, Brian Ichter
Recent works have shown how the reasoning capabilities of Large Language Models (LLMs) can be applied to domains beyond natural language processing, such as planning and interaction for robots. These embodied problems require an agent to understand many semantic aspects of the world: the repertoire of skills available, how these skills influence the world, and how changes to the world map back to the language. LLMs planning in embodied environments need to consider not just what skills to do, but also how and when to do them - answers that change over time in response to the agent's own choices. In this work, we investigate to what extent LLMs used in such embodied contexts can reason over sources of feedback provided through natural language, without any additional training. We propose that by leveraging environment feedback, LLMs are able to form an inner monologue that allows them to more richly process and plan in robotic control scenarios. We investigate a variety of sources of feedback, such as success detection, scene description, and human interaction. We find that closed-loop language feedback significantly improves high-level instruction completion on three domains, including simulated and real table top rearrangement tasks and long-horizon mobile manipulation tasks in a kitchen environment in the real world.
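The closed-loop pattern described here, in which textual feedback is appended to the planner's prompt after every skill execution, can be sketched as a short loop. The language model and skill executor below are stubs for illustration only, not any specific LLM API or the authors' system.

```python
# Toy closed-loop "inner monologue": feedback is written back into the prompt as text.
def language_model(prompt: str) -> str:
    # Stub planner: keep proposing the same skill until the prompt records a success.
    return "done" if "success: pick up the sponge" in prompt else "pick up the sponge"

def execute_skill(skill: str, attempt: int) -> bool:
    # Stub executor: fails on the first try, succeeds afterwards.
    return attempt >= 1

def inner_monologue(instruction: str, max_steps: int = 5) -> str:
    prompt = f"Task: {instruction}\n"
    for attempt in range(max_steps):
        skill = language_model(prompt)
        if skill == "done":
            return prompt
        ok = execute_skill(skill, attempt)
        # Feed environment feedback back into the prompt as plain language.
        prompt += f"{'success' if ok else 'failure'}: {skill}\n"
    return prompt

print(inner_monologue("clean the table"))
```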
Citations: 324
Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning
Pub Date : 2022-07-11 DOI: 10.48550/arXiv.2207.04703
Homer Walke, Jonathan Yang, Albert Yu, Aviral Kumar, Jedrzej Orbik, Avi Singh, S. Levine
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems. However, in practice, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment. Moreover, robotic policies learned with RL often fail when deployed beyond the carefully controlled setting in which they were learned. In this work, we study how these challenges can all be tackled by effective utilization of diverse offline datasets collected from previously seen tasks. When faced with a new task, our system adapts previously learned skills to quickly learn to both perform the new task and return the environment to an initial state, effectively performing its own environment reset. Our empirical results demonstrate that incorporating prior data into robotic reinforcement learning enables autonomous learning, substantially improves sample-efficiency of learning, and enables better generalization. Project website: https://sites.google.com/view/ariel-berkeley/
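The "perform the task, then reset the environment yourself" pattern described above can be sketched as a toy loop. The skill, environment, and offline pretraining are all stand-ins, and the function names are illustrative assumptions rather than the authors' code.

```python
# Toy autonomous RL loop alternating a task attempt with a self-reset, no human needed.
import random

def pretrained_skill(goal, state):
    # Stand-in for a skill adapted from prior offline data: move toward the goal.
    return state + 0.5 * (goal - state) + random.gauss(0, 0.01)

def autonomous_rl(task_goal=1.0, reset_goal=0.0, episodes=3, steps=10):
    state, log = reset_goal, []
    for _ in range(episodes):
        for _ in range(steps):                  # attempt the new task
            state = pretrained_skill(task_goal, state)
        log.append(round(state, 3))
        for _ in range(steps):                  # self-reset back to the initial state
            state = pretrained_skill(reset_goal, state)
    return log

print(autonomous_rl())   # state reached at the end of each task attempt
```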
Citations: 16
Taskography: Evaluating robot task planning over large 3D scene graphs
Pub Date : 2022-07-11 DOI: 10.48550/arXiv.2207.05006
Christopher Agia, Krishna Murthy Jatavallabhula, M. Khodeir, O. Mikšík, Vibhav Vineet, Mustafa Mukadam, L. Paull, F. Shkurti
3D scene graphs (3DSGs) are an emerging description, unifying symbolic, topological, and metric scene representations. However, typical 3DSGs contain hundreds of objects and symbols even for small environments, rendering task planning on the full graph impractical. We construct Taskography, the first large-scale robotic task planning benchmark over 3DSGs. While most benchmarking efforts in this area focus on vision-based planning, we systematically study symbolic planning, to decouple planning performance from visual representation learning. We observe that, among existing methods, neither classical nor learning-based planners are capable of real-time planning over full 3DSGs. Enabling real-time planning demands progress on both (a) sparsifying 3DSGs for tractable planning and (b) designing planners that better exploit 3DSG hierarchies. Towards the former goal, we propose SCRUB, a task-conditioned 3DSG sparsification method, enabling classical planners to match and in some cases surpass state-of-the-art learning-based planners. Towards the latter goal, we propose SEEK, a procedure enabling learning-based planners to exploit 3DSG structure, reducing the number of replanning queries required by current best approaches by an order of magnitude. We will open-source all code and baselines to spur further research along the intersections of robot task planning, learning and 3DSGs.
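The general idea of task-conditioned scene-graph sparsification can be illustrated with a tiny pruning example: keep only the nodes the task goal refers to plus their ancestors, so the planner sees a much smaller graph. This is not the authors' SCRUB algorithm, only a sketch of the principle.

```python
# Toy scene-graph sparsification: retain goal objects and their ancestors, prune the rest.
def sparsify_scene_graph(parents, goal_objects):
    """parents: dict mapping each node to its parent (None for the root);
    goal_objects: nodes mentioned by the task goal."""
    keep = set()
    for node in goal_objects:
        while node is not None and node not in keep:
            keep.add(node)
            node = parents[node]
    return keep

# Toy scene graph: building -> rooms -> objects.
parents = {"building": None, "kitchen": "building", "office": "building",
           "mug": "kitchen", "sink": "kitchen", "stapler": "office"}
print(sparsify_scene_graph(parents, {"mug", "sink"}))
# Keeps building, kitchen, mug, sink; the office subtree is pruned away.
```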
Citations: 31
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Pub Date : 2022-07-10 DOI: 10.48550/arXiv.2207.04429
Dhruv Shah, B. Osinski, Brian Ichter, S. Levine
Goal-conditioned policies for robotic navigation can be trained on large, unannotated datasets, providing for good generalization to real-world settings. However, particularly in vision-based settings where specifying goals requires an image, this makes for an unnatural interface. Language provides a more convenient modality for communication with robots, but contemporary methods typically require expensive supervision, in the form of trajectories annotated with language descriptions. We present a system, LM-Nav, for robotic navigation that enjoys the benefits of training on unannotated large datasets of trajectories, while still providing a high-level interface to the user. Instead of utilizing a labeled instruction-following dataset, we show that such a system can be constructed entirely out of pre-trained models for navigation (ViNG), image-language association (CLIP), and language modeling (GPT-3), without requiring any fine-tuning or language-annotated robot data. We instantiate LM-Nav on a real-world mobile robot and demonstrate long-horizon navigation through complex, outdoor environments from natural language instructions. For videos of our experiments, code release, and an interactive Colab notebook that runs in your browser, please check out our project page https://sites.google.com/view/lmnav
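The three-stage decomposition the abstract describes (a language model extracts landmarks, a vision-language model grounds them in graph nodes, and a search over the topological graph visits them in order) can be sketched as follows. Every model call is stubbed with toy data, and the proper graph search over ViNG connectivity is omitted; all names are illustrative.

```python
# Toy LM-Nav-style pipeline with stubbed models; not the authors' implementation.
def extract_landmarks(instruction):                     # stand-in for GPT-3
    return ["stop sign", "picnic table", "blue building"]

def landmark_node_scores(landmarks, node_images):       # stand-in for CLIP
    # Returns scores[node][landmark_index]; here a fixed toy matrix for 3 nodes.
    return {0: [0.9, 0.1, 0.2], 1: [0.2, 0.8, 0.3], 2: [0.1, 0.2, 0.95]}

def plan_over_graph(scores, landmarks):
    # Toy "search": pick, in instruction order, the node matching each landmark best.
    # (LM-Nav performs a real graph search using ViNG edge connectivity; omitted here.)
    return [max(scores, key=lambda n: scores[n][i]) for i in range(len(landmarks))]

landmarks = extract_landmarks("go past the stop sign and the picnic table to the blue building")
scores = landmark_node_scores(landmarks, node_images=None)
print(plan_over_graph(scores, landmarks))   # -> [0, 1, 2]
```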
Citations: 140
NeuralGrasps: Learning Implicit Representations for Grasps of Multiple Robotic Hands
Pub Date : 2022-07-06 DOI: 10.48550/arXiv.2207.02959
Ninad Khargonkar, Neil Song, Zesheng Xu, B. Prabhakaran, Yu Xiang
We introduce a neural implicit representation for grasps of objects from multiple robotic hands. Different grasps across multiple robotic hands are encoded into a shared latent space. Each latent vector is decoded into the 3D shape of an object and the 3D shape of a robotic hand in a grasping pose, represented by the signed distance functions of the two 3D shapes. In addition, the distance metric in the latent space is learned to preserve the similarity between grasps across different robotic hands, where the similarity of grasps is defined according to contact regions of the robotic hands. This property enables our method to transfer grasps between different grippers including a human hand, and grasp transfer has the potential to share grasping skills between robots and enable robots to learn grasping skills from humans. Furthermore, the encoded signed distance functions of objects and grasps in our implicit representation can be used for 6D object pose estimation with grasping contact optimization from partial point clouds, which enables robotic grasping in the real world.
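The shared-latent decoder described here can be sketched as a small network mapping a latent code and a 3D query point to two signed distance values, one for the object and one for the hand. The layer sizes and tensor shapes below are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch of a shared-latent SDF decoder for object and hand geometry.
import torch
import torch.nn as nn

class GraspSDFDecoder(nn.Module):
    def __init__(self, latent_dim=64, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2))    # outputs [sdf_object, sdf_hand]

    def forward(self, latent, query_points):
        # latent: (B, latent_dim); query_points: (B, N, 3)
        z = latent.unsqueeze(1).expand(-1, query_points.shape[1], -1)
        return self.net(torch.cat([z, query_points], dim=-1))

decoder = GraspSDFDecoder()
z = torch.randn(4, 64)                 # one latent code per grasp
pts = torch.rand(4, 128, 3) - 0.5      # query points around the origin
sdf = decoder(z, pts)                  # (4, 128, 2)
print(sdf.shape)
```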
Citations: 8