An algorithmic account for how humans efficiently learn, transfer, and compose hierarchically structured decision policies

Cognition (IF 2.8; Zone 1, Psychology; JCR Q1, PSYCHOLOGY, EXPERIMENTAL), Volume 254, Article 105967. Pub Date: 2024-10-04. DOI: 10.1016/j.cognition.2024.105967
Jing-Jing Li, Anne G.E. Collins
Citations: 0

Abstract

Learning structures that effectively abstract decision policies is key to the flexibility of human intelligence. Previous work has shown that humans use hierarchically structured policies to efficiently navigate complex and dynamic environments. However, the computational processes that support the learning and construction of such policies remain insufficiently understood. To address this question, we tested 1026 human participants, who made over 1 million choices combined, in a decision-making task where they could learn, transfer, and recompose multiple sets of hierarchical policies. We propose a novel algorithmic account for the learning processes underlying observed human behavior. We show that humans rely on compressed policies over states in early learning, which gradually unfold into hierarchical representations via meta-learning and Bayesian inference. Our modeling evidence suggests that these hierarchical policies are structured in a temporally backward, rather than forward, fashion. Taken together, these algorithmic architectures characterize how the interplay between reinforcement learning, policy compression, meta-learning, and working memory supports structured decision-making and compositionality in a resource-rational way.
Source journal
Cognition (PSYCHOLOGY, EXPERIMENTAL)
CiteScore: 6.40
Self-citation rate: 5.90%
Articles published: 283
Journal description: Cognition is an international journal that publishes theoretical and experimental papers on the study of the mind. It covers a wide variety of subjects concerning all the different aspects of cognition, ranging from biological and experimental studies to formal analysis. Contributions from the fields of psychology, neuroscience, linguistics, computer science, mathematics, ethology and philosophy are welcome in this journal provided that they have some bearing on the functioning of the mind. In addition, the journal serves as a forum for discussion of social and political aspects of cognitive science.