面向家庭自动化的多任务分层模仿学习

2019 IEEE 15th International Conference on Automation Science and Engineering (CASE) Pub Date : 2019-08-25 DOI:10.1109/COASE.2019.8843293

Roy Fox, R. Berenstein, I. Stoica, Ken Goldberg

{"title":"面向家庭自动化的多任务分层模仿学习","authors":"Roy Fox, R. Berenstein, I. Stoica, Ken Goldberg","doi":"10.1109/COASE.2019.8843293","DOIUrl":null,"url":null,"abstract":"Control policies for home automation robots can be learned from human demonstrations, and hierarchical control has the potential to reduce the required number of demonstrations. When learning multiple policies for related tasks, demonstrations can be reused between the tasks to further reduce the number of demonstrations needed to learn each new policy. We present HIL-MT, a framework for Multi-Task Hierarchical Imitation Learning, involving a human teacher, a networked Toyota HSR robot, and a cloud-based server that stores demonstrations and trains models. In our experiments, HIL-MT learns a policy for clearing a table of dishes from 11.2 demonstrations on average. Learning to set the table requires 19 new demonstrations when training separately, but only 11.6 new demonstrations when also reusing demonstrations of clearing the table. HIL-MT learns policies for building 3- and 4-level pyramids of glass cups from 8.2 and 5 demonstrations, respectively, but reusing the 3-level demonstrations for learning a 4-level policy only requires 2.7 new demonstrations. These results suggest that learning hierarchical policies for structured domestic tasks can reuse existing demonstrations of related tasks to reduce the need for new demonstrations.","PeriodicalId":6695,"journal":{"name":"2019 IEEE 15th International Conference on Automation Science and Engineering (CASE)","volume":"39 2 1","pages":"1-8"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"23","resultStr":"{\"title\":\"Multi-Task Hierarchical Imitation Learning for Home Automation\",\"authors\":\"Roy Fox, R. Berenstein, I. Stoica, Ken Goldberg\",\"doi\":\"10.1109/COASE.2019.8843293\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Control policies for home automation robots can be learned from human demonstrations, and hierarchical control has the potential to reduce the required number of demonstrations. When learning multiple policies for related tasks, demonstrations can be reused between the tasks to further reduce the number of demonstrations needed to learn each new policy. We present HIL-MT, a framework for Multi-Task Hierarchical Imitation Learning, involving a human teacher, a networked Toyota HSR robot, and a cloud-based server that stores demonstrations and trains models. In our experiments, HIL-MT learns a policy for clearing a table of dishes from 11.2 demonstrations on average. Learning to set the table requires 19 new demonstrations when training separately, but only 11.6 new demonstrations when also reusing demonstrations of clearing the table. HIL-MT learns policies for building 3- and 4-level pyramids of glass cups from 8.2 and 5 demonstrations, respectively, but reusing the 3-level demonstrations for learning a 4-level policy only requires 2.7 new demonstrations. These results suggest that learning hierarchical policies for structured domestic tasks can reuse existing demonstrations of related tasks to reduce the need for new demonstrations.\",\"PeriodicalId\":6695,\"journal\":{\"name\":\"2019 IEEE 15th International Conference on Automation Science and Engineering (CASE)\",\"volume\":\"39 2 1\",\"pages\":\"1-8\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-08-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"23\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE 15th International Conference on Automation Science and Engineering (CASE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/COASE.2019.8843293\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 15th International Conference on Automation Science and Engineering (CASE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/COASE.2019.8843293","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 23

摘要

家庭自动化机器人的控制策略可以从人类演示中学习，分层控制有可能减少所需的演示次数。当为相关任务学习多个策略时，可以在任务之间重用演示，以进一步减少学习每个新策略所需的演示数量。我们提出了HIL-MT，一个多任务分层模仿学习框架，涉及一名人类教师、一个联网的丰田高铁机器人和一个存储演示和训练模型的基于云的服务器。在我们的实验中，HIL-MT平均从11.2个演示中学习清理一桌菜的策略。学习摆桌子在单独训练时需要19个新的演示，而在重复使用清理桌子的演示时只需要11.6个新的演示。hill - mt分别从8.2和5个演示中学习构建3级和4级玻璃杯金字塔的策略，但是重用3级演示来学习4级策略只需要2.7个新的演示。这些结果表明，学习结构化家务任务的分层策略可以重用相关任务的现有演示，以减少对新演示的需求。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Multi-Task Hierarchical Imitation Learning for Home Automation

Control policies for home automation robots can be learned from human demonstrations, and hierarchical control has the potential to reduce the required number of demonstrations. When learning multiple policies for related tasks, demonstrations can be reused between the tasks to further reduce the number of demonstrations needed to learn each new policy. We present HIL-MT, a framework for Multi-Task Hierarchical Imitation Learning, involving a human teacher, a networked Toyota HSR robot, and a cloud-based server that stores demonstrations and trains models. In our experiments, HIL-MT learns a policy for clearing a table of dishes from 11.2 demonstrations on average. Learning to set the table requires 19 new demonstrations when training separately, but only 11.6 new demonstrations when also reusing demonstrations of clearing the table. HIL-MT learns policies for building 3- and 4-level pyramids of glass cups from 8.2 and 5 demonstrations, respectively, but reusing the 3-level demonstrations for learning a 4-level policy only requires 2.7 new demonstrations. These results suggest that learning hierarchical policies for structured domestic tasks can reuse existing demonstrations of related tasks to reduce the need for new demonstrations.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2019 IEEE 15th International Conference on Automation Science and Engineering (CASE)

自引率

0.00%

发文量