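Frontiers | Cognitive mechanisms of learning in sequential decision-making under uncertainty: an experimental and theoretical approach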

IF 3.4 | Zone 3 (Medicine) | Q2 BEHAVIORAL SCIENCES | Frontiers in Behavioral Neuroscience | Pub Date: 2024-07-19 | DOI: 10.3389/fnbeh.2024.1399394

Gloria Cecchini, Michael DePass, Emre Baspinar, Marta Andujar, Surabhi Ramawat, Pierpaolo Pani, Stefano Ferraina, Alain Destexhe, Rubén Moreno-Bote, Ignasi Cos


Citations: 0

Abstract




Learning to make adaptive decisions involves making choices, assessing their consequence, and leveraging this assessment to attain higher rewarding states. Despite vast literature on value-based decision-making, relatively little is known about the cognitive processes underlying decisions in highly uncertain contexts. Real world decisions are rarely accompanied by immediate feedback, explicit rewards, or complete knowledge of the environment. Being able to make informed decisions in such contexts requires significant knowledge about the environment, which can only be gained via exploration. Here we aim at understanding and formalizing the brain mechanisms underlying these processes. To this end, we first designed and performed an experimental task. Human participants had to learn to maximize reward while making sequences of decisions with only basic knowledge of the environment, and in the absence of explicit performance cues. Participants had to rely on their own internal assessment of performance to reveal a covert relationship between their choices and their subsequent consequences to find a strategy leading to the highest cumulative reward. Our results show that the participants’ reaction times were longer whenever the decision involved a future consequence, suggesting greater introspection whenever a delayed value had to be considered. The learning time varied significantly across participants. Second, we formalized the neurocognitive processes underlying decision-making within this task, combining mean-field representations of competing neural populations with a reinforcement learning mechanism. This model provided a plausible characterization of the brain dynamics underlying these processes, and reproduced each aspect of the participants’ behavior, from their reaction times and choices to their learning rates. 
In summary, both the experimental results and the model provide a principled explanation to how delayed value may be computed and incorporated into the neural dynamics of decision-making, and to how learning occurs in these uncertain scenarios.
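The abstract describes the model only at a high level: competing neural populations in a mean-field description, coupled to a reinforcement learning mechanism. As a rough illustration of how such a coupling might look, and not the authors' implementation, the sketch below races two mutually inhibiting rate units to a threshold to produce a choice and a reaction time, then updates learned values from the reward collected over the rest of the episode. Everything specific here is an assumption made for illustration: the function names (`simulate_trial`, `update_values`, `run_episode`), all parameter values, the Monte Carlo update rule, and the two-step toy task in `TASK`.

```python
import math
import random

def simulate_trial(input_a, input_b, rng, noise=0.3, inhibition=1.5,
                   threshold=0.8, tau=0.02, dt=0.001, max_time=3.0):
    """Race two mutually inhibiting rate units; return (choice, reaction_time)."""
    r = [0.0, 0.0]
    inputs = (input_a, input_b)
    for step in range(1, round(max_time / dt) + 1):
        new_r = []
        for i in (0, 1):
            # Rectified drive: own input minus inhibition from the rival unit.
            drive = max(0.0, inputs[i] - inhibition * r[1 - i])
            # Euler-Maruyama step with additive Gaussian noise.
            kick = noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            new_r.append(max(0.0, r[i] + dt / tau * (drive - r[i]) + kick))
        r = new_r
        if max(r) >= threshold:
            return (0 if r[0] >= r[1] else 1), step * dt
    # No unit reached threshold: report the leading unit at the time limit.
    return (0 if r[0] >= r[1] else 1), max_time

def update_values(values, episode, alpha=0.2):
    """Move each visited (state, action) value toward the reward collected
    from that step to the end of the episode (every-visit Monte Carlo)."""
    ret = 0.0
    for state, action, reward in reversed(episode):
        ret += reward
        old = values.get((state, action), 0.0)
        values[(state, action)] = old + alpha * (ret - old)
    return values

# Hypothetical two-step task: (state, action) -> (immediate reward, next state).
# Taking the smaller first reward leads to the richer second state.
TASK = {
    ("start", 0): (0.3, "rich"),
    ("start", 1): (0.6, "poor"),
    ("rich", 0): (1.0, None),
    ("rich", 1): (0.8, None),
    ("poor", 0): (0.3, None),
    ("poor", 1): (0.1, None),
}

def run_episode(values, rng, baseline=1.0):
    """Play one episode; learned values bias the inputs to the two units."""
    state, episode, reaction_times = "start", [], []
    while state is not None:
        inputs = [baseline + values.get((state, a), 0.0) for a in (0, 1)]
        action, rt = simulate_trial(inputs[0], inputs[1], rng)
        reward, next_state = TASK[(state, action)]
        episode.append((state, action, reward))
        reaction_times.append(rt)
        state = next_state
    update_values(values, episode)
    return episode, reaction_times

if __name__ == "__main__":
    rng = random.Random(0)
    values = {}
    for _ in range(200):
        run_episode(values, rng)
    print(values)
```

In this sketch the only source of exploration is the noise in the rate dynamics, and the delayed consequence enters a decision only through the learned value of the first choice. Whether the toy learner finds the higher cumulative reward depends on the assumed noise level and learning rate.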


Source journal

Frontiers in Behavioral Neuroscience BEHAVIORAL SCIENCES-NEUROSCIENCES

CiteScore: 4.70

Self-citation rate: 3.30%

Articles published: 506

Review time: 6-12 weeks

About the journal: Frontiers in Behavioral Neuroscience is a leading journal in its field, publishing rigorously peer-reviewed research that advances our understanding of the neural mechanisms underlying behavior. Field Chief Editor Nuno Sousa at the Instituto de Pesquisa em Ciências da Vida e da Saúde (ICVS) is supported by an outstanding Editorial Board of international experts. This multidisciplinary open-access journal is at the forefront of disseminating and communicating scientific knowledge and impactful discoveries to researchers, academics, clinicians and the public worldwide. This journal publishes major insights into the neural mechanisms of animal and human behavior, and welcomes articles studying the interplay between behavior and its neurobiological basis at all levels: from molecular biology and genetics, to morphological, biochemical, neurochemical, electrophysiological, neuroendocrine, pharmacological, and neuroimaging studies.