Reinforcement Learning Interventions on Boundedly Rational Human Agents in Frictionful Tasks.

Eura Nofshin, Siddharth Swaroop, Weiwei Pan, Susan Murphy, Finale Doshi-Velez
Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS). Published 2024-05-01 (e-pub 2024-05-06).
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11460771/pdf/

Abstract

Many important behavior changes are frictionful; they require individuals to expend effort over a long period with little immediate gratification. Here, an artificial intelligence (AI) agent can provide personalized interventions to help individuals stick to their goals. In these settings, the AI agent must personalize rapidly (before the individual disengages) and interpretably, to help us understand the behavioral interventions. In this paper, we introduce Behavior Model Reinforcement Learning (BMRL), a framework in which an AI agent intervenes on the parameters of a Markov Decision Process (MDP) belonging to a boundedly rational human agent. Our formulation of the human decision-maker as a planning agent allows us to attribute undesirable human policies (ones that do not lead to the goal) to their maladapted MDP parameters, such as an extremely low discount factor. Furthermore, we propose a class of tractable human models that captures fundamental behaviors in frictionful tasks. Introducing a notion of MDP equivalence specific to BMRL, we theoretically and empirically show that AI planning with our human models can lead to helpful policies on a wide range of more complex, ground-truth humans.
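The abstract's core mechanism, attributing an undesirable human policy to a maladapted MDP parameter such as an extremely low discount factor, and having the AI intervene on that parameter, can be illustrated with a toy sketch. Everything below (the chain MDP, the reward numbers, the `human_policy` helper) is an invented illustration of the general idea, not the paper's actual model:

```python
# Toy illustration of the BMRL setting (all names/numbers are invented):
# a "human" plans in a 5-state chain MDP via value iteration. Each step
# toward the goal costs effort; only the final step pays off. With a very
# low discount factor the delayed goal reward is not worth the effort, so
# the human quits early. An AI intervention that raises the discount
# factor flips the resulting policy toward the goal.

def human_policy(gamma, n_states=5, effort=-1.0, goal_reward=10.0,
                 iters=200):
    """Value-iterate a chain MDP: action 'quit' absorbs with reward 0;
    action 'work' moves one state toward the goal, paying `effort`
    (or earning `goal_reward` on the final step)."""
    V = [0.0] * (n_states + 1)  # state n_states is the absorbing goal
    for _ in range(iters):
        for s in range(n_states):
            r = goal_reward if s + 1 == n_states else effort
            V[s] = max(0.0, r + gamma * V[s + 1])  # quit vs. work
    # Greedy policy: since V[s] = max(0, Q_work), working is strictly
    # better than quitting exactly when V[s] > 0.
    return ["work" if V[s] > 0.0 else "quit" for s in range(n_states)]

before = human_policy(gamma=0.1)  # myopic human: quits before the goal
after = human_policy(gamma=0.9)   # after intervening on gamma: persists
```

Here the AI agent's "intervention" is simply setting a new value of `gamma`; in the paper's framework the AI would itself plan over such parameter interventions, and the low pre-intervention `gamma` is one example of the maladapted parameters the abstract mentions.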
