Linking confidence biases to reinforcement-learning processes.

IF 5.8 | Zone 1 (Psychology) | Q1 PSYCHOLOGY | Psychological Review | Pub Date: 2023-07-01 | DOI: 10.1037/rev0000424

Nahuel Salem-Garcia, Stefano Palminteri, Maël Lebreton

Psychological Review, 130(4), 1017-1043.

Citations: 4

Abstract




We systematically misjudge our own performance in simple economic tasks. First, we generally overestimate our ability to make correct choices, a bias called overconfidence. Second, we are more confident in our choices when we seek gains than when we try to avoid losses, a bias we refer to as the valence-induced confidence bias. Strikingly, these two biases are also present in reinforcement-learning (RL) contexts, even though outcomes are provided trial by trial and could, in principle, be used to recalibrate confidence judgments online. How confidence biases emerge and are maintained in reinforcement-learning contexts is thus puzzling and still unaccounted for. To explain this paradox, we propose that confidence biases stem from learning biases, and we test this hypothesis using data from multiple experiments in which we concomitantly assessed instrumental choices and confidence judgments during learning and transfer phases. Our results first show that participants' choices in both tasks are best accounted for by a reinforcement-learning model featuring context-dependent learning and confirmatory updating. We then demonstrate that the complex, biased pattern of confidence judgments elicited during both tasks can be explained by an overweighting of the learned value of the chosen option in the computation of confidence judgments. We finally show that, consequently, the individual learning model parameters responsible for the learning biases (confirmatory updating and outcome context-dependency) are predictive of the individual metacognitive biases. We conclude by suggesting that the metacognitive biases originate from fundamentally biased learning computations. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
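The abstract names three ingredients without giving equations: context-dependent learning, confirmatory updating, and a confidence readout that overweights the learned value of the chosen option. The sketch below is one way these ideas could be written down, not the authors' fitted model. Every function name, parameter name, default value, and the logistic form of the confidence readout is an assumption made for illustration.

```python
import math


def update_values(q_chosen, q_unchosen, v_context, r_chosen, r_unchosen,
                  alpha_conf=0.3, alpha_disc=0.1, alpha_v=0.1):
    """One complete-feedback trial of a hypothetical biased learner.

    Assumed mechanisms (illustrative only):
    - context dependency: outcomes are re-expressed relative to a learned
      context value before updating the option values;
    - confirmatory updating: prediction errors that confirm the choice
      (positive for the chosen option, negative for the unchosen one)
      use a larger learning rate than disconfirming ones.
    """
    # Track the average outcome of the current context.
    v_context += alpha_v * ((r_chosen + r_unchosen) / 2.0 - v_context)

    # Prediction errors on context-relative outcomes.
    delta_c = (r_chosen - v_context) - q_chosen
    delta_u = (r_unchosen - v_context) - q_unchosen

    # Asymmetric learning rates depending on whether the error confirms the choice.
    a_c = alpha_conf if delta_c > 0 else alpha_disc
    a_u = alpha_conf if delta_u < 0 else alpha_disc

    return q_chosen + a_c * delta_c, q_unchosen + a_u * delta_u, v_context


def confidence(q_chosen, q_unchosen, w_diff=2.0, w_chosen=1.0, bias=0.0):
    """Hypothetical confidence readout in (0, 1).

    Confidence depends on the value difference plus an extra weight on the
    chosen option's value, which is the assumed source of the bias.
    """
    x = bias + w_diff * (q_chosen - q_unchosen) + w_chosen * q_chosen
    return 1.0 / (1.0 + math.exp(-x))


# Same value difference (0.5), but the chosen value is higher in the gain case.
conf_gain = confidence(q_chosen=0.5, q_unchosen=0.0)
conf_loss = confidence(q_chosen=0.0, q_unchosen=-0.5)
```

Under these assumptions, two choices with the same value difference receive different confidence whenever the chosen option's learned value differs, so `conf_gain` exceeds `conf_loss`. This mirrors the direction of the valence-induced bias described above, though the actual model specification and parameter estimates are only in the full paper.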


Source journal

Psychological Review (Medicine: Psychology)

CiteScore: 9.70

Self-citation rate: 5.60%

Journal description: Psychological Review publishes articles that make important theoretical contributions to any area of scientific psychology, including systematic evaluation of alternative theories.