Revisiting the Role of Uncertainty-Driven Exploration in a (Perceived) Non-Stationary World

Dalin Guo, Angela J Yu

CogSci ... Annual Conference of the Cognitive Science Society. Cognitive Science Society (U.S.). Conference, vol. 43 (July 2021), pp. 2045-2051.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8341546/pdf/nihms-1725387.pdf

Abstract


Humans are often faced with an exploration-versus-exploitation trade-off. In a commonly used paradigm, the multi-armed bandit task, humans have been shown to exhibit an "uncertainty bonus", which combines with the estimated reward to drive exploration. However, previous studies often modeled belief updating using either a Bayesian model that assumed the reward contingency to remain stationary, or a reinforcement learning model. Separately, we previously showed that human learning in the bandit task is best captured by a dynamic-belief Bayesian model, which assumes the reward contingency can change over time. We hypothesize that the estimated uncertainty bonus may depend on which learning model is employed. Here, we re-analyze a bandit dataset using all three learning models. We find that the dynamic-belief model captures human choice behavior best, while also uncovering a much larger uncertainty bonus than the other models. More broadly, our results emphasize the importance of choosing an appropriate learning model, as it is crucial for correctly characterizing the processes underlying human decision making.
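The paper itself includes no code, but the three learning models contrasted above are compact enough to sketch. The Python snippet below is a minimal, illustrative implementation assuming a Bernoulli bandit and a discretized belief grid over each arm's reward rate; the function names (`fbm_update`, `dbm_update`, `rl_update`, `choose`) and the parameter values (change probability `alpha`, learning rate `lr`, bonus weight `bonus`) are hypothetical choices for illustration, not the paper's fitted models or parameter estimates.

```python
import numpy as np

# Discretized hypothesis grid over an arm's reward probability theta.
GRID = np.linspace(0.01, 0.99, 99)
PRIOR = np.full(GRID.size, 1.0 / GRID.size)  # uniform prior (an assumption)

def fbm_update(belief, reward):
    """Fixed-belief (stationary) Bayesian update: plain Bayes' rule,
    assuming the reward contingency never changes."""
    lik = GRID if reward else 1.0 - GRID
    post = belief * lik
    return post / post.sum()

def dbm_update(belief, reward, alpha=0.8):
    """Dynamic-belief Bayesian update: before each observation the reward
    rate persists with probability alpha, or is redrawn from the prior."""
    pred = alpha * belief + (1.0 - alpha) * PRIOR  # leaky prediction step
    lik = GRID if reward else 1.0 - GRID
    post = pred * lik
    return post / post.sum()

def rl_update(value, reward, lr=0.1):
    """Delta-rule reinforcement-learning update of a point estimate."""
    return value + lr * (reward - value)

def choose(beliefs, bonus=1.0):
    """Uncertainty-bonus choice rule: pick the arm maximizing estimated
    reward plus a bonus scaled by the posterior standard deviation."""
    means = np.array([b @ GRID for b in beliefs])
    stds = np.array([np.sqrt(b @ GRID**2 - m**2)
                     for b, m in zip(beliefs, means)])
    return int(np.argmax(means + bonus * stds))
```

One intuition the sketch makes visible: because the dynamic-belief update keeps mixing the prior back into the posterior on every trial, the belief never fully sharpens, so the per-arm uncertainty fed into the choice rule remains larger than under the fixed-belief update. Note also that `choose` as written applies to the two Bayesian models; the delta-rule learner tracks only a point estimate and would need a separate uncertainty proxy.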
