Natural behaviour is learned through dopamine-mediated reinforcement

Nature (IF 48.5; Zone 1, multidisciplinary journals; JCR Q1, MULTIDISCIPLINARY SCIENCES). Pub Date: 2025-03-12. DOI: 10.1038/s41586-025-08729-1

Jonathan Kasdin, Alison Duffy, Nathan Nadler, Arnav Raha, Adrienne L. Fairhall, Kimberly L. Stachenfeld, Vikram Gadagkar

Nature, volume 641, issue 8063, pages 699-706. https://www.nature.com/articles/s41586-025-08729-1

Abstract

Many natural motor skills, such as speaking or locomotion, are acquired through a process of trial-and-error learning over the course of development. It has long been hypothesized, motivated by observations in artificial learning experiments, that dopamine has a crucial role in this process. Dopamine in the basal ganglia is thought to guide reward-based trial-and-error learning by encoding reward prediction errors [1], decreasing after worse-than-predicted reward outcomes and increasing after better-than-predicted ones. Our previous work in adult zebra finches, in which we changed the perceived song quality with distorted auditory feedback, showed that dopamine in Area X, the singing-related basal ganglia, encodes performance prediction error: dopamine is suppressed after worse-than-predicted (distorted syllables) and activated after better-than-predicted (undistorted syllables) performance [2]. However, it remains unknown whether the learning of natural behaviours, such as developmental vocal learning, occurs through dopamine-based reinforcement. Here we tracked song learning trajectories in juvenile zebra finches and used fibre photometry [3] to monitor concurrent dopamine activity in Area X. We found that dopamine was activated after syllable renditions that were closer to the eventual adult version of the song, compared with recent renditions, and suppressed after renditions that were further away. Furthermore, the relationship between dopamine and song fluctuations revealed that dopamine predicted the future evolution of song, suggesting that dopamine drives behaviour. Finally, dopamine activity was explained by the contrast between the quality of the current rendition and the recent history of renditions, consistent with dopamine's hypothesized role in encoding prediction errors in an actor–critic reinforcement-learning model [4,5]. Reinforcement-learning algorithms [6] have emerged as a powerful class of model to explain learning in reward-based laboratory tasks, as well as for driving autonomous learning in artificial intelligence [7]. Our results suggest that complex natural behaviours in biological systems can also be acquired through dopamine-mediated reinforcement learning.

Studies in zebra finches show that dopamine has a key role as a reinforcement signal in the trial-and-error process of learning that underlies complex natural behaviours.
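
The actor–critic idea mentioned in the abstract can be illustrated with a toy simulation. The following is a minimal sketch under stated assumptions, not the authors' model or analysis: song is reduced to a single hypothetical feature, rendition quality is taken to be the negative squared distance from an assumed adult target, and the function name, parameters, and learning rates are all illustrative choices. The critic keeps a running expectation of quality from recent renditions, the prediction error is the contrast between the current rendition's quality and that expectation, and the actor shifts its motor programme towards variants that did better than expected.

```python
import random


def simulate_song_learning(target=1.0, n_renditions=5000, sigma=0.2,
                           actor_lr=0.05, critic_lr=0.1, seed=0):
    """Toy actor-critic loop on one hypothetical song feature (illustrative only)."""
    rng = random.Random(seed)
    mean = 0.0                        # actor: current motor programme
    baseline = -(mean - target) ** 2  # critic: expected quality from recent renditions
    history = []
    for _ in range(n_renditions):
        x = rng.gauss(mean, sigma)    # one rendition, with exploratory variability
        quality = -(x - target) ** 2  # assumed quality: higher when closer to the target
        delta = quality - baseline    # prediction error (the dopamine-like signal)
        # Actor: move towards variants that were better than expected,
        # away from variants that were worse than expected.
        mean += actor_lr * delta * (x - mean)
        # Critic: update the expectation of quality towards recent outcomes.
        baseline += critic_lr * delta
        history.append((x, quality, delta))
    return mean, history


if __name__ == "__main__":
    final_mean, _ = simulate_song_learning()
    print(f"final motor programme: {final_mean:.3f} (target 1.0)")
```

In this sketch `delta` is positive after a rendition that beats the recent average and negative after one that falls short, which mirrors the qualitative pattern the abstract reports for dopamine. Because the actor update is driven by `delta`, the prediction error also anticipates the direction in which the motor programme subsequently moves.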

Source journal: Nature (multidisciplinary)

CiteScore: 90.00
Self-citation rate: 1.20%
Articles published: 3652
Review time: 3 months

About the journal: Nature is a prestigious international journal that publishes peer-reviewed research in various scientific and technological fields. The selection of articles is based on criteria such as originality, importance, interdisciplinary relevance, timeliness, accessibility, elegance, and surprising conclusions. In addition to showcasing significant scientific advances, Nature delivers rapid, authoritative, insightful news and interpretation of current and upcoming trends impacting science, scientists, and the broader public. The journal serves a dual purpose: firstly, to promptly share noteworthy scientific advances and foster discussions among scientists, and secondly, to ensure the swift dissemination of scientific results globally, emphasizing their significance for knowledge, culture, and daily life.