Human Social Feedback for Efficient Interactive Reinforcement Agent Learning

Jinying Lin, Qilei Zhang, R. Gomez, Keisuke Nakamura, Bo He, Guangliang Li
Published in: 2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2020-08-01
DOI: 10.1109/RO-MAN47096.2020.9223516 (https://doi.org/10.1109/RO-MAN47096.2020.9223516)
Citations: 3

Abstract

As a branch of reinforcement learning, interactive reinforcement learning studies the interaction between humans and agents, allowing agents to learn from the intentions of human users and adapt to their preferences. In most current studies, human users must deliberately provide explicit feedback by pressing keyboard buttons or clicking a mouse. In this paper, we propose an interactive reinforcement learning method that enables an agent to learn from human social signals: facial feedback captured by an ordinary camera and gestural feedback captured by a Leap Motion sensor. Our method gives ordinary people a natural way to train agents to perform a task according to their preferences. We tested our method in two reinforcement learning benchmark domains, LoopMaze and Tetris, and compared it with the state of the art, the TAMER framework. Our experimental results show that when learning from facial feedback, whose recognition accuracy is very low, the TAMER agent can reach performance similar to that of learning from keypress feedback, with slightly more feedback. When learning from gestural feedback, which is recognized more accurately, the TAMER agent can reach performance similar to that of learning from keypress feedback with much less feedback. Moreover, our results indicate that recognition errors in facial feedback affect agent performance more at the beginning of training than in later stages. Finally, our results indicate that, with sufficient recognition accuracy, human social signals can effectively improve the learning efficiency of agents with less human feedback.
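The abstract does not describe the implementation, so the following is only a minimal sketch of the general idea, not the authors' code. It assumes a tabular TAMER-style learner that fits a model H(s, a) of the human's feedback and acts greedily on it, with feedback encoded as +1 or -1. The `recognize` function, the toy five-state task, the 10% exploration rate, and all names and parameter values are hypothetical stand-ins; `recognize` simply flips the trainer's intended label with probability `1 - accuracy` to imitate a facial or gestural recognizer that makes errors.

```python
import random


class TamerSketch:
    """Tabular TAMER-style learner: models human feedback H(s, a), acts greedily on it."""

    def __init__(self, n_states, n_actions, step_size=0.2):
        self.h = [[0.0] * n_actions for _ in range(n_states)]
        self.step_size = step_size

    def act(self, state):
        # Pick the action with the highest predicted human feedback.
        row = self.h[state]
        return max(range(len(row)), key=lambda a: row[a])

    def update(self, state, action, feedback):
        # Supervised step: move the prediction toward the observed feedback.
        error = feedback - self.h[state][action]
        self.h[state][action] += self.step_size * error


def recognize(true_feedback, accuracy, rng):
    """Hypothetical recognizer: returns the intended label with probability
    `accuracy`, otherwise the flipped label."""
    return true_feedback if rng.random() < accuracy else -true_feedback


def train(accuracy, steps=2000, seed=0):
    """Train on a toy task and return the fraction of states in which the
    greedy action matches the trainer's preferred action."""
    rng = random.Random(seed)
    n_states, n_actions = 5, 3
    preferred = [s % n_actions for s in range(n_states)]  # assumed trainer preferences
    agent = TamerSketch(n_states, n_actions)
    for _ in range(steps):
        state = rng.randrange(n_states)
        # Occasional random action (an assumption of this sketch) so that
        # every action receives some feedback.
        if rng.random() < 0.1:
            action = rng.randrange(n_actions)
        else:
            action = agent.act(state)
        true_feedback = 1.0 if action == preferred[state] else -1.0
        agent.update(state, action, recognize(true_feedback, accuracy, rng))
    return sum(agent.act(s) == preferred[s] for s in range(n_states)) / n_states
```

Calling `train` with different `accuracy` values gives a rough feel for how recognition errors in the feedback channel can affect what the agent learns. The toy task says nothing about the LoopMaze or Tetris results reported in the paper.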