Exploring Computational User Models for Agent Policy Summarization.

Isaac Lage, Daphna Lifschitz, Finale Doshi-Velez, Ofra Amir
IJCAI: Proceedings of the Conference, vol. 28, pp. 1401-1407. Published 2019-08-01.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7901848/pdf/nihms-1067306.pdf

Abstract

AI agents support high stakes decision-making processes from driving cars to prescribing drugs, making it increasingly important for human users to understand their behavior. Policy summarization methods aim to convey strengths and weaknesses of such agents by demonstrating their behavior in a subset of informative states. Some policy summarization methods extract a summary that optimizes the ability to reconstruct the agent's policy under the assumption that users will deploy inverse reinforcement learning. In this paper, we explore the use of different models for extracting summaries. We introduce an imitation learning-based approach to policy summarization; we demonstrate through computational simulations that a mismatch between the model used to extract a summary and the model used to reconstruct the policy results in worse reconstruction quality; and we demonstrate through a human-subject study that people use different models to reconstruct policies in different contexts, and that matching the summary extraction model to these can improve performance. Together, our results suggest that it is important to carefully consider user models in policy summarization.
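
The idea of extracting a summary that suits a particular user model can be illustrated with a toy sketch. This is not the authors' method or code: the one-dimensional state space, the `toy_policy`, the nearest-neighbor imitation learner standing in for the user, and the greedy selection rule are all assumptions made for illustration. It only shows the general pattern of choosing which states to demonstrate so that an assumed user model reconstructs the agent's policy as well as possible.

```python
from typing import Callable, Dict, List

State = float
Action = str


def nearest_neighbor_reconstruct(summary: Dict[State, Action], state: State) -> Action:
    """Assumed user model (imitation-learning style): copy the action
    shown in the demonstrated state closest to the queried state."""
    closest = min(summary, key=lambda shown: abs(shown - state))
    return summary[closest]


def reconstruction_accuracy(
    summary: Dict[State, Action],
    policy: Callable[[State], Action],
    states: List[State],
) -> float:
    """Fraction of states where the assumed user model predicts the agent's action."""
    if not summary:
        return 0.0
    correct = sum(
        nearest_neighbor_reconstruct(summary, s) == policy(s) for s in states
    )
    return correct / len(states)


def greedy_summary(
    policy: Callable[[State], Action],
    states: List[State],
    budget: int,
) -> Dict[State, Action]:
    """Greedily add the state whose demonstration most improves reconstruction."""
    summary: Dict[State, Action] = {}
    for _ in range(budget):
        candidates = [s for s in states if s not in summary]
        if not candidates:
            break
        best = max(
            candidates,
            key=lambda s: reconstruction_accuracy(
                {**summary, s: policy(s)}, policy, states
            ),
        )
        summary[best] = policy(best)
    return summary


def toy_policy(state: State) -> Action:
    """Hypothetical agent policy: brake once the state value reaches 6."""
    return "brake" if state >= 6 else "drive"


states = [float(i) for i in range(10)]
summary = greedy_summary(toy_policy, states, budget=2)
accuracy = reconstruction_accuracy(summary, toy_policy, states)
print(summary, accuracy)
```

Swapping `nearest_neighbor_reconstruct` for a different reconstruction model while keeping the same extracted summary is one way to probe, in this toy setting, the kind of extraction/reconstruction mismatch the abstract describes.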
