Understanding the development of reward learning through the lens of meta-learning

Nature Reviews Psychology (IF 21.8; Q1, Psychology, Multidisciplinary). Pub Date: 2024-04-18. DOI: 10.1038/s44159-024-00304-1

Kate Nussenbaum, Catherine A. Hartley

Nature Reviews Psychology, volume 3, issue 6, pages 424-438. Journal Article. URL: https://www.nature.com/articles/s44159-024-00304-1

Abstract

Determining how environments shape how people learn is central to understanding individual differences in goal-directed behaviour. Studies of the effects of early-life adversity on reward learning have revealed that the environments that infants and children experience exert lasting influences on reward-guided behaviour. However, the varied findings from this research are difficult to reconcile under a unified computational account. Studies of adaptive reinforcement learning have demonstrated that learning algorithms and parameters dynamically adapt to support reward-guided behaviour in varied contexts, but this body of research has largely focused on learning that proceeds within the short timeframes of experimental tasks. In this Perspective, we argue that, to understand how the structure of experienced environments shapes reward learning across development, computational accounts of the effects of environmental statistics on reinforcement learning need to be extended to encompass learning across multiple nested timescales of experience. To this end, we consider the development of reward learning through the lens of meta-learning models, in particular meta-reinforcement learning. This computational formalization can inspire new hypotheses and methods for empirical research to understand how features of experienced environments give rise to individual differences in learning and adaptive behaviour across development.

Environments shape reward learning, which can result in individual differences in behaviour. In this Perspective, Nussenbaum and Hartley consider the development of reward learning through the lens of meta-learning models, in particular meta-reinforcement learning.
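
The idea of learning across nested timescales can be illustrated with a toy simulation. The sketch below is not the authors' model: meta-reinforcement learning in the literature usually involves a learner trained across a distribution of tasks, whereas here a simple grid search over a single learning rate stands in for the slow outer loop. The inner loop is an error-driven value update within one episode; the outer loop selects the learning rate that predicts reward best across many episodes from the same environment. All names (`run_episode`, `meta_learn_alpha`, `switch_every`) and all numbers (reward probabilities of 0.8 and 0.2, 200 trials, the candidate learning rates) are assumptions chosen for illustration.

```python
import random


def run_episode(alpha, switch_every, rng, n_trials=200):
    """Inner loop (fast timescale): trial-by-trial value learning.

    Returns the mean squared reward prediction error for one episode.
    If switch_every is None the reward probability never changes.
    """
    p_reward = 0.8   # assumed initial reward probability
    value = 0.5      # initial reward expectation
    total_sq_error = 0.0
    for t in range(n_trials):
        # In a volatile environment the reward probability flips periodically.
        if switch_every and t > 0 and t % switch_every == 0:
            p_reward = 1.0 - p_reward
        reward = 1.0 if rng.random() < p_reward else 0.0
        error = reward - value
        total_sq_error += error ** 2
        value += alpha * error   # error-driven update with learning rate alpha
    return total_sq_error / n_trials


def meta_learn_alpha(switch_every, candidates=(0.05, 0.1, 0.2, 0.3, 0.5),
                     n_episodes=200, seed=0):
    """Outer loop (slow timescale): choose the learning rate that minimises
    prediction error averaged over many episodes of one environment."""
    def mean_loss(alpha):
        # Re-seeding gives every candidate the same reward sequences.
        rng = random.Random(seed)
        losses = [run_episode(alpha, switch_every, rng) for _ in range(n_episodes)]
        return sum(losses) / n_episodes
    return min(candidates, key=mean_loss)


alpha_stable = meta_learn_alpha(switch_every=None)   # reward statistics fixed
alpha_volatile = meta_learn_alpha(switch_every=20)   # reward statistics reverse
print("stable:", alpha_stable, "volatile:", alpha_volatile)
```

Under these assumptions the outer loop settles on a lower learning rate for the stable environment than for the volatile one, so two learners running the same inner-loop algorithm end up with different parameters purely because of the statistics of the environments they experienced.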
