Exploring Computational User Models for Agent Policy Summarization.

IJCAI : proceedings of the conference Pub Date : 2019-08-01

Isaac Lage, Daphna Lifschitz, Finale Doshi-Velez, Ofra Amir

Volume 28, pages 1401-1407. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7901848/pdf/nihms-1067306.pdf

Citations: 0



Abstract

AI agents support high stakes decision-making processes from driving cars to prescribing drugs, making it increasingly important for human users to understand their behavior. Policy summarization methods aim to convey strengths and weaknesses of such agents by demonstrating their behavior in a subset of informative states. Some policy summarization methods extract a summary that optimizes the ability to reconstruct the agent's policy under the assumption that users will deploy inverse reinforcement learning. In this paper, we explore the use of different models for extracting summaries. We introduce an imitation learning-based approach to policy summarization; we demonstrate through computational simulations that a mismatch between the model used to extract a summary and the model used to reconstruct the policy results in worse reconstruction quality; and we demonstrate through a human-subject study that people use different models to reconstruct policies in different contexts, and that matching the summary extraction model to these can improve performance. Together, our results suggest that it is important to carefully consider user models in policy summarization.
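The idea of extracting a summary for a particular user model can be illustrated with a small sketch. This is not the authors' method or code: the one-dimensional state space, the toy `agent_policy`, the nearest-neighbor `nearest_neighbor_user` (standing in for an imitation-learning style user), the `majority_user` (standing in for a mismatched user) and the greedy selection loop are all assumptions made purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

State = float
Action = int
Summary = List[Tuple[State, Action]]
UserModel = Callable[[Summary, State], Action]


def agent_policy(state: State) -> Action:
    """Toy agent (assumption): take action 1 on the right half of the line."""
    return 1 if state >= 0.5 else 0


def nearest_neighbor_user(summary: Summary, state: State) -> Action:
    """Hypothetical user model: copy the action shown in the closest summary state."""
    _, action = min(summary, key=lambda pair: abs(pair[0] - state))
    return action


def majority_user(summary: Summary, state: State) -> Action:
    """Hypothetical mismatched user: ignore the state, guess the most frequent action."""
    actions = [action for _, action in summary]
    return max(sorted(set(actions)), key=actions.count)


def reconstruction_accuracy(summary: Summary, states: Sequence[State],
                            policy: Callable[[State], Action],
                            user_model: UserModel) -> float:
    """Fraction of states where the user model's guess matches the agent's action."""
    if not summary:
        return 0.0
    correct = sum(user_model(summary, s) == policy(s) for s in states)
    return correct / len(states)


def greedy_summary(states: Sequence[State], policy: Callable[[State], Action],
                   user_model: UserModel, budget: int) -> Summary:
    """Greedily add the state-action pair that most improves reconstruction."""
    summary: Summary = []
    remaining = list(states)
    for _ in range(min(budget, len(remaining))):
        best = max(
            remaining,
            key=lambda s: reconstruction_accuracy(
                summary + [(s, policy(s))], states, policy, user_model),
        )
        summary.append((best, policy(best)))
        remaining.remove(best)
    return summary


if __name__ == "__main__":
    states = [i / 10 for i in range(11)]
    summary = greedy_summary(states, agent_policy, nearest_neighbor_user, budget=2)
    matched = reconstruction_accuracy(summary, states, agent_policy, nearest_neighbor_user)
    mismatched = reconstruction_accuracy(summary, states, agent_policy, majority_user)
    print("summary:", summary)
    print("matched user model accuracy:", matched)
    print("mismatched user model accuracy:", mismatched)
```

In this toy setting, the summary is chosen to suit the nearest-neighbor user, so evaluating it with a different user model can only do as well or worse. That mirrors, in a very simplified form, the mismatch effect the abstract describes.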

