Human Social Feedback for Efficient Interactive Reinforcement Agent Learning
Jinying Lin, Qilei Zhang, R. Gomez, Keisuke Nakamura, Bo He, Guangliang Li
2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), August 2020. DOI: 10.1109/RO-MAN47096.2020.9223516
As a branch of reinforcement learning, interactive reinforcement learning studies the interaction process between humans and agents, allowing agents to learn from the intentions of human users and adapt to their preferences. In most current studies, human users must intentionally provide explicit feedback via keyboard presses or mouse clicks. In this paper, we propose an interactive reinforcement learning method that allows an agent to learn from human social signals: facial feedback captured by an ordinary camera and gestural feedback captured by a Leap Motion sensor. Our method provides a natural way for ordinary people to train agents to perform a task according to their preferences. We tested our method in two reinforcement learning benchmark domains, LoopMaze and Tetris, and compared it to the state of the art, the TAMER framework. Our experimental results show that when learning from facial feedback, whose recognition accuracy is very low, the TAMER agent can reach performance similar to learning from keypress feedback while receiving slightly more feedback. When learning from gestural feedback, which is recognized more accurately, the TAMER agent can reach similar performance with much less feedback received. Moreover, our results indicate that the recognition error of facial feedback has a larger effect on agent performance in the early stage of training than in the later stage. Finally, our results indicate that, given sufficient recognition accuracy, human social signals can effectively improve the learning efficiency of agents while requiring less human feedback.
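The TAMER framework referenced in the abstract learns a model of the human trainer's feedback rather than an environmental reward, and acts greedily with respect to that model. The sketch below is a minimal, illustrative tabular TAMER-style agent in Python; it is not the authors' code. It assumes the recognized social signal has already been decoded into a scalar (+1 for a positive expression or gesture, -1 for a negative one, 0 for no signal), and all names and parameters are hypothetical.

```python
import random
from collections import defaultdict


class TamerStyleAgent:
    """Minimal tabular TAMER-style agent (illustrative sketch only).

    Learns H(s, a), a prediction of the human trainer's feedback for
    taking action a in state s, and acts greedily with respect to H.
    """

    def __init__(self, actions, step_size=0.1, epsilon=0.05):
        self.actions = actions
        self.step_size = step_size
        self.epsilon = epsilon
        # H(s, a): predicted human reinforcement, initialized to 0.
        self.H = defaultdict(float)

    def act(self, state):
        # Mostly greedy w.r.t. predicted human feedback; a small epsilon
        # keeps some exploration alive while feedback is still sparse.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.H[(state, a)])

    def update(self, state, action, feedback):
        # feedback in {-1, 0, +1}: e.g., a decoded frown or negative
        # gesture -> -1, a smile or thumbs-up -> +1, no signal -> 0.
        if feedback == 0:
            return  # no human signal, no update
        error = feedback - self.H[(state, action)]
        self.H[(state, action)] += self.step_size * error
```

In this framing, replacing keypress feedback with facial or gestural feedback changes only how the scalar feedback signal is produced, not the learning update itself, which is consistent with the comparison the abstract describes: noisier recognition (faces) effectively injects feedback errors, while more accurate recognition (gestures) lets the agent reach similar performance with less feedback.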