Explanation-Based Reward Coaching to Improve Human Performance via Reinforcement Learning

Aaquib Tabrez, Shivendra Agrawal, Bradley Hayes
Published in: 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 249-257
Publication date: 2019-03-11
DOI: 10.1109/HRI.2019.8673104 (https://doi.org/10.1109/HRI.2019.8673104)
Citations: 47

Abstract

For robots to effectively collaborate with humans, it is critical to establish a shared mental model amongst teammates. In the case of incongruous models, catastrophic failures may occur unless mitigating steps are taken. To identify and remedy these potential issues, we propose a novel mechanism for enabling an autonomous system to detect model disparity between itself and a human collaborator, infer the source of the disagreement within the model, evaluate potential consequences of this error, and finally, provide human-interpretable feedback to encourage model correction. This process effectively enables a robot to provide a human with a policy update based on perceived model disparity, reducing the likelihood of costly or dangerous failures during joint task execution. This paper makes two contributions at the intersection of explainable AI (xAI) and human-robot collaboration: 1) The Reward Augmentation and Repair through Explanation (RARE) framework for estimating task understanding and 2) A human subjects study illustrating the effectiveness of reward augmentation-based policy repair in a complex collaborative task.
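
The abstract gives no algorithmic detail, so the following is only a toy sketch of the general idea (detect a disparity, infer its source, produce a correction), not the RARE framework itself. It assumes a hypothetical five-state corridor, a reward built from three named components, and a human who may be unaware of at most one of them. All names here (`COMPONENTS`, `greedy_policy`, `diagnose`, `explain`) and all numbers are illustrative assumptions.

```python
# Toy sketch: guess which reward component a human is missing by comparing
# observed actions with the policy each "one component missing" model implies.
# The world, rewards, and function names are hypothetical, not from the paper.

STATES = range(5)                      # corridor of states 0..4
ACTIONS = {"left": -1, "right": +1}
TERMINAL = {0, 4}                      # episodes end at either end
GAMMA = 0.9

# Reward received on entering a state, split into named components.
COMPONENTS = {
    "small_goal": {0: 5.0},
    "big_goal": {4: 10.0},
    "hazard": {3: -20.0},
}


def step(state, action):
    """Deterministic move, clamped to the corridor."""
    return min(max(state + ACTIONS[action], 0), 4)


def reward(next_state, active):
    """Sum of the active components' rewards for entering next_state."""
    return sum(COMPONENTS[name].get(next_state, 0.0) for name in active)


def greedy_policy(active, sweeps=50):
    """Value iteration, then the greedy action in each non-terminal state."""
    value = {s: 0.0 for s in STATES}

    def q(s, a):
        s2 = step(s, a)
        return reward(s2, active) + GAMMA * value[s2]

    for _ in range(sweeps):
        for s in STATES:
            if s not in TERMINAL:
                value[s] = max(q(s, a) for a in ACTIONS)
    return {s: max(ACTIONS, key=lambda a: q(s, a))
            for s in STATES if s not in TERMINAL}


def diagnose(observed):
    """Return the component whose omission best matches the observed
    (state, action) pairs, or None if the full model fits at least as well."""
    candidates = [None] + list(COMPONENTS)   # None = human model is complete

    def agreement(missing):
        active = [c for c in COMPONENTS if c != missing]
        policy = greedy_policy(active)
        return sum(policy[s] == a for s, a in observed)

    return max(candidates, key=agreement)    # ties go to the earlier candidate


def explain(missing):
    """Turn the diagnosis into a short human-readable correction."""
    if missing is None:
        return "No correction needed."
    facts = ", ".join(f"entering state {s} yields reward {r:+.0f}"
                      for s, r in COMPONENTS[missing].items())
    return f"Reminder: {facts}."


if __name__ == "__main__":
    human_actions = [(1, "right"), (2, "right"), (3, "right")]
    print(explain(diagnose(human_actions)))
```

In this sketch a human who always moves right is best explained by a model that omits the `hazard` component, so the generated message restates that component. Some omissions produce the same behavior as the complete model (here, a missing `small_goal`), in which case the sketch reports no disparity; a fuller treatment would need a probabilistic model of the human rather than exact action matching.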