Revisiting the Role of Uncertainty-Driven Exploration in a (Perceived) Non-Stationary World.

CogSci ... Annual Conference of the Cognitive Science Society. Cognitive Science Society (U.S.). Conference. Pub Date: 2021-07-01

Dalin Guo, Angela J Yu

Volume 43, pages 2045-2051. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8341546/pdf/nihms-1725387.pdf

Citations: 0

Abstract

Humans are often faced with an exploration-versus-exploitation trade-off. A commonly used paradigm, the multi-armed bandit, has shown that humans exhibit an "uncertainty bonus", which combines with estimated reward to drive exploration. However, previous studies often modeled belief updating using either a Bayesian model that assumed the reward contingency to remain stationary, or a reinforcement learning model. Separately, we previously showed that human learning in the bandit task is best captured by a dynamic-belief Bayesian model. We hypothesize that the estimated uncertainty bonus may depend on which learning model is employed. Here, we re-analyze a bandit dataset using all three learning models. We find that the dynamic-belief model captures human choice behavior best, while also uncovering a much larger uncertainty bonus than the other models. More broadly, our results emphasize the importance of an appropriate learning model, as it is crucial for correctly characterizing the processes underlying human decision making.
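To make the notion of an "uncertainty bonus" concrete, the following is a minimal sketch, not the authors' model. It assumes a Bernoulli bandit, a stationary Beta posterior as the learning model, the posterior standard deviation as the uncertainty measure, and a softmax choice rule. The function names and the parameters `bonus_weight` and `inverse_temperature` are hypothetical and chosen only for illustration.

```python
import math


def beta_mean_std(successes, failures, prior_a=1.0, prior_b=1.0):
    """Mean and standard deviation of a Beta(prior_a + s, prior_b + f) belief.

    Assumption: a stationary (fixed-belief) Beta-Bernoulli learner.
    """
    a = prior_a + successes
    b = prior_b + failures
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1.0))
    return mean, math.sqrt(variance)


def choice_probabilities(counts, bonus_weight, inverse_temperature=5.0):
    """Softmax over value = estimated reward + bonus_weight * uncertainty.

    counts is a list of (successes, failures) pairs, one per arm.
    bonus_weight and inverse_temperature are hypothetical free parameters.
    """
    values = []
    for successes, failures in counts:
        mean, std = beta_mean_std(successes, failures)
        values.append(mean + bonus_weight * std)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(values)
    weights = [math.exp(inverse_temperature * (v - top)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


# Arm 0 is well sampled and rewarding; arm 1 is barely sampled.
counts = [(8, 2), (1, 1)]
no_bonus = choice_probabilities(counts, bonus_weight=0.0)
with_bonus = choice_probabilities(counts, bonus_weight=1.0)
print(no_bonus, with_bonus)
```

In this sketch, a positive `bonus_weight` shifts choice probability toward the less-sampled arm, whose posterior is wider. Swapping in a different learning model changes both the estimated reward and the uncertainty that feed the choice rule, which is why the fitted bonus can depend on the learning model assumed.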


Source journal

CogSci ... Annual Conference of the Cognitive Science Society. Cognitive Science Society (U.S.). Conference

Self-citation rate

0.00%

Number of articles published