Multi-Task Hierarchical Imitation Learning for Home Automation

Roy Fox, R. Berenstein, I. Stoica, Ken Goldberg
Published in: 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), pp. 1-8
Publication date: 2019-08-25
DOI: 10.1109/COASE.2019.8843293 (https://doi.org/10.1109/COASE.2019.8843293)
Citations: 23

Abstract

Control policies for home automation robots can be learned from human demonstrations, and hierarchical control has the potential to reduce the required number of demonstrations. When learning multiple policies for related tasks, demonstrations can be reused between the tasks to further reduce the number of demonstrations needed to learn each new policy. We present HIL-MT, a framework for Multi-Task Hierarchical Imitation Learning, involving a human teacher, a networked Toyota HSR robot, and a cloud-based server that stores demonstrations and trains models. In our experiments, HIL-MT learns a policy for clearing a table of dishes from 11.2 demonstrations on average. Learning to set the table requires 19 new demonstrations when training separately, but only 11.6 new demonstrations when also reusing demonstrations of clearing the table. HIL-MT learns policies for building 3- and 4-level pyramids of glass cups from 8.2 and 5 demonstrations, respectively, but reusing the 3-level demonstrations for learning a 4-level policy only requires 2.7 new demonstrations. These results suggest that learning hierarchical policies for structured domestic tasks can reuse existing demonstrations of related tasks to reduce the need for new demonstrations.
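
To make the reported savings concrete, the sketch below computes the relative reduction in new demonstrations implied by the averages quoted in the abstract. The percentages are derived here and are not stated in the abstract itself; the helper name `relative_saving` and the task labels are hypothetical and used only for illustration.

```python
def relative_saving(separate: float, with_reuse: float) -> float:
    """Fraction of new demonstrations saved by reusing related-task demonstrations."""
    if separate <= 0:
        raise ValueError("separate must be positive")
    return (separate - with_reuse) / separate


# Average new-demonstration counts quoted in the abstract:
# (training separately, training with reuse of the related task's demonstrations).
reported = {
    "set table (reusing clear-table demonstrations)": (19.0, 11.6),
    "4-level pyramid (reusing 3-level demonstrations)": (5.0, 2.7),
}

for task, (separate, with_reuse) in reported.items():
    saving = relative_saving(separate, with_reuse)
    print(f"{task}: {saving:.1%} fewer new demonstrations")
```

Under these figures, reuse cuts the new demonstrations needed by roughly 39% for setting the table and 46% for the 4-level pyramid. These are ratios of the reported averages only and say nothing about the variance across trials.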