Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks

Conference on Robot Learning | Pub Date: 2022-10-12 | DOI: 10.48550/arXiv.2210.06601

Kuan Fang, Patrick Yin, Ashvin Nair, Homer Walke, Ge Yan, S. Levine

Citations: 9

Abstract

The use of broad datasets has proven crucial for generalization in a wide range of fields. However, how to make effective use of diverse multi-task data for novel downstream tasks remains a grand challenge in robotics. To tackle this challenge, we introduce a framework that acquires goal-conditioned policies for unseen temporally extended tasks via offline reinforcement learning on broad data, in combination with online fine-tuning guided by subgoals in a learned lossy representation space. When faced with a novel task goal, the framework uses an affordance model to plan a sequence of lossy representations as subgoals that decompose the original task into easier problems. Learned from the broad data, the lossy representation emphasizes task-relevant information about states and goals while abstracting away redundant contexts that hinder generalization. It thus enables subgoal planning for unseen tasks, provides a compact input to the policy, and facilitates reward shaping during fine-tuning. We show that our framework can be pre-trained on large-scale datasets of robot experiences from prior work and efficiently fine-tuned for novel tasks, entirely from visual inputs without any manual reward engineering.
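The abstract's idea of planning subgoals in a lossy representation space can be illustrated with a toy sketch. This is not the authors' implementation: `encode`, `affordance`, `plan_subgoals`, and `shaped_reward` are hypothetical stand-ins (the paper learns its encoder and affordance model from data, and its planner may differ), and the random-shooting search below is simply one common way to optimize over sampled latent sequences.

```python
import numpy as np

def encode(observation):
    """Hypothetical lossy encoder: keeps the first two coordinates and
    discards the rest, standing in for a learned representation."""
    return np.asarray(observation, dtype=float)[:2]

def affordance(z, u):
    """Hypothetical affordance model: maps the current representation z and
    a latent sample u to a nearby, presumably reachable, next subgoal."""
    return z + 0.5 * np.tanh(u)

def plan_subgoals(z_start, z_goal, horizon=4, num_samples=256, seed=0):
    """Random-shooting planner (an assumption, not the paper's planner):
    sample latent sequences, roll them through the affordance model, and
    keep the sequence whose final subgoal lands closest to the goal."""
    rng = np.random.default_rng(seed)
    best_cost, best_plan = np.inf, None
    for _ in range(num_samples):
        z, plan = z_start, []
        for u in rng.normal(size=(horizon, z_start.shape[0])):
            z = affordance(z, u)
            plan.append(z)
        cost = float(np.linalg.norm(plan[-1] - z_goal))
        if cost < best_cost:
            best_cost, best_plan = cost, np.stack(plan)
    return best_plan, best_cost

def shaped_reward(z, z_subgoal):
    """Illustrative shaped reward: negative distance to the current subgoal,
    measured in the lossy representation space."""
    return -float(np.linalg.norm(z - z_subgoal))

# The third coordinate plays the role of task-irrelevant context that the
# lossy encoder abstracts away.
start_obs = [0.0, 0.0, 7.0]
goal_obs = [1.0, 1.0, -3.0]
plan, cost = plan_subgoals(encode(start_obs), encode(goal_obs))
```

In this sketch each subgoal lies within a bounded step of the previous one, so a low-level goal-conditioned policy would only need to solve short-horizon problems, and `shaped_reward` gives it a dense signal toward the current subgoal.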
