Linking confidence biases to reinforcement-learning processes.

IF 5.1 | Zone 1, Psychology | Q1 PSYCHOLOGY | Psychological Review | Pub Date: 2023-07-01 | DOI: 10.1037/rev0000424
Nahuel Salem-Garcia, Stefano Palminteri, Maël Lebreton
Psychological Review, volume 130, issue 4, pages 1017-1043. Publication type: Journal Article.
Citations: 4

Abstract


We systematically misjudge our own performance in simple economic tasks. First, we generally overestimate our ability to make correct choices, a bias called overconfidence. Second, we are more confident in our choices when we seek gains than when we try to avoid losses, a bias we refer to as the valence-induced confidence bias. Strikingly, these two biases are also present in reinforcement-learning (RL) contexts, despite the fact that outcomes are provided trial-by-trial and could, in principle, be used to recalibrate confidence judgments online. How confidence biases emerge and are maintained in reinforcement-learning contexts is thus puzzling and still unaccounted for. To explain this paradox, we propose that confidence biases stem from learning biases, and test this hypothesis using data from multiple experiments, where we concomitantly assessed instrumental choices and confidence judgments during learning and transfer phases. Our results first show that participants' choices in both tasks are best accounted for by a reinforcement-learning model featuring context-dependent learning and confirmatory updating. We then demonstrate that the complex, biased pattern of confidence judgments elicited during both tasks can be explained by an overweighting of the learned value of the chosen option in the computation of confidence judgments. We finally show that, consequently, the individual learning model parameters responsible for the learning biases (confirmatory updating and outcome context-dependency) are predictive of the individual metacognitive biases. We conclude by suggesting that the metacognitive biases originate from fundamentally biased learning computations. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
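
The abstract names three ingredients (context-dependent learning, confirmatory updating, and an overweighting of the chosen option's learned value in confidence) but gives no equations. The Python sketch below is only an illustration under assumed forms, not the authors' model: outcomes are encoded relative to a running context value, learning rates differ for choice-confirming and choice-disconfirming prediction errors, and confidence is a logistic function of separately weighted chosen and unchosen values. Every function name, parameter name, and number here is hypothetical.

```python
import math


def confirmatory_update(q_chosen, q_unchosen, r_chosen, r_unchosen,
                        context_value, alpha_conf, alpha_disc, alpha_ctx):
    """One trial of a two-option learner with feedback on both options.

    Assumed (hypothetical) formulation, not taken from the paper.
    """
    # Context-dependent learning: track the average outcome of the context
    # and encode each outcome relative to it.
    mean_outcome = 0.5 * (r_chosen + r_unchosen)
    context_value += alpha_ctx * (mean_outcome - context_value)

    delta_chosen = (r_chosen - context_value) - q_chosen
    delta_unchosen = (r_unchosen - context_value) - q_unchosen

    # Confirmatory updating: a positive error for the chosen option or a
    # negative error for the unchosen option confirms the choice and is
    # learned with alpha_conf; everything else uses alpha_disc.
    lr_chosen = alpha_conf if delta_chosen > 0 else alpha_disc
    lr_unchosen = alpha_conf if delta_unchosen < 0 else alpha_disc

    q_chosen += lr_chosen * delta_chosen
    q_unchosen += lr_unchosen * delta_unchosen
    return q_chosen, q_unchosen, context_value


def confidence(q_chosen, q_unchosen, w_chosen, w_unchosen, bias=0.0):
    """Logistic confidence readout; w_chosen > w_unchosen overweights the chosen value."""
    z = bias + w_chosen * q_chosen - w_unchosen * q_unchosen
    return 1.0 / (1.0 + math.exp(-z))


# Same value difference (0.5) in a gain-like and a loss-like context.
conf_gain = confidence(0.75, 0.25, w_chosen=2.0, w_unchosen=1.0)
conf_loss = confidence(-0.25, -0.75, w_chosen=2.0, w_unchosen=1.0)
print(conf_gain > conf_loss)
```

With equal weights the two confidence values would be identical, because only the value difference would matter. Under this assumed readout, weighting the chosen value more heavily makes confidence depend on the overall value level as well, which is one way a valence-dependent confidence pattern could arise.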

Source journal
Psychological Review (Medicine - Psychology)
CiteScore: 9.70
Self-citation rate: 5.60%
Articles published: 97
Journal description: Psychological Review publishes articles that make important theoretical contributions to any area of scientific psychology, including systematic evaluation of alternative theories.