Adaptive Language-Guided Abstraction from Contrastive Explanations

arXiv - CS - Robotics | Pub Date: 2024-09-12 | DOI: arxiv-2409.08212

Andi Peng, Belinda Z. Li, Ilia Sucholutsky, Nishanth Kumar, Julie A. Shah, Jacob Andreas, Andreea Bobu



Abstract

Many approaches to robot learning begin by inferring a reward function from a set of human demonstrations. To learn a good reward, it is necessary to determine which features of the environment are relevant before determining how these features should be used to compute reward. End-to-end methods for joint feature and reward learning (e.g., using deep networks or program synthesis techniques) often yield brittle reward functions that are sensitive to spurious state features. By contrast, humans can often learn generalizably from a small number of demonstrations by incorporating strong priors about which features of a demonstration are likely meaningful for a task of interest. How do we build robots that leverage this kind of background knowledge when learning from new demonstrations? This paper describes a method named ALGAE (Adaptive Language-Guided Abstraction from [Contrastive] Explanations), which alternates between using language models to iteratively identify the human-meaningful features needed to explain demonstrated behavior and using standard inverse reinforcement learning techniques to assign weights to these features. Experiments across a variety of simulated and real-world robot environments show that ALGAE learns generalizable reward functions defined on interpretable features using only small numbers of demonstrations. Importantly, ALGAE can recognize when features are missing, then extract and define those features without any human input, making it possible to quickly and efficiently acquire rich representations of user behavior.
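The alternation described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation, and every name in it is hypothetical. It rests on three simplifying assumptions: the language model is replaced by a fixed pool of named candidate features (`CANDIDATE_POOL`, `propose_feature`), demonstrations are reduced to (preferred, rejected) state pairs, and inverse reinforcement learning is swapped for a plain logistic preference fit over a linear reward. A "missing feature" is detected when the fitted reward cannot rank every pair correctly.

```python
import math
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Feature = Tuple[str, Callable[[State], float]]
Pair = Tuple[State, State]  # (preferred state, rejected state)

# Hypothetical stand-in for a language model: a fixed pool of named features
# handed out one at a time. A real system would query a model instead.
CANDIDATE_POOL: List[Feature] = [
    ("color_is_red", lambda s: s["red"]),
    ("distance_to_goal", lambda s: s["dist"]),
]


def propose_feature(known: List[str]) -> Feature:
    """Return the next candidate feature that is not already in use."""
    for name, fn in CANDIDATE_POOL:
        if name not in known:
            return name, fn
    raise RuntimeError("no candidate features left")


def feature_diff(pair: Pair, feats: List[Feature]) -> List[float]:
    """Feature vector of the preferred state minus that of the rejected one."""
    good, bad = pair
    return [fn(good) - fn(bad) for _, fn in feats]


def fit_weights(pairs: List[Pair], feats: List[Feature],
                steps: int = 500, lr: float = 0.5) -> List[float]:
    """Gradient ascent on a logistic preference likelihood (IRL stand-in)."""
    w = [0.0] * len(feats)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for pair in pairs:
            diff = feature_diff(pair, feats)
            margin = sum(wi * di for wi, di in zip(w, diff))
            p = 1.0 / (1.0 + math.exp(-margin))
            for i, di in enumerate(diff):
                grad[i] += (1.0 - p) * di
        w = [wi + lr * gi / len(pairs) for wi, gi in zip(w, grad)]
    return w


def accuracy(pairs: List[Pair], feats: List[Feature], w: List[float]) -> float:
    """Fraction of pairs where the preferred state gets strictly higher reward."""
    hits = 0
    for pair in pairs:
        diff = feature_diff(pair, feats)
        if sum(wi * di for wi, di in zip(w, diff)) > 0:
            hits += 1
    return hits / len(pairs)


def learn_reward(pairs: List[Pair], target_acc: float = 1.0,
                 max_rounds: int = 5) -> Tuple[List[str], List[float]]:
    """Alternate: add a feature, refit weights, stop once the pairs are explained."""
    feats: List[Feature] = []
    w: List[float] = []
    for _ in range(max_rounds):
        feats.append(propose_feature([name for name, _ in feats]))
        w = fit_weights(pairs, feats)
        if accuracy(pairs, feats, w) >= target_acc:
            break  # current features suffice; nothing appears to be missing
    return [name for name, _ in feats], w


# Toy data: the preferred state is always closer to the goal; colour is spurious.
PAIRS: List[Pair] = [
    ({"red": 1.0, "dist": 0.2}, {"red": 1.0, "dist": 0.9}),
    ({"red": 0.0, "dist": 0.1}, {"red": 1.0, "dist": 0.8}),
    ({"red": 1.0, "dist": 0.3}, {"red": 0.0, "dist": 0.7}),
    ({"red": 0.0, "dist": 0.2}, {"red": 0.0, "dist": 0.6}),
]

if __name__ == "__main__":
    names, weights = learn_reward(PAIRS)
    print(names, weights)
```

On this toy data the colour feature alone cannot rank the pairs, so the loop requests a second feature and stops once distance is included. The distance weight comes out negative, meaning states closer to the goal receive higher reward. The stopping test here is only a crude proxy for the missing-feature detection the abstract mentions.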


