Transformers and cortical waves: encoders for pulling in context across time.

Trends in Neurosciences · IF 14.6 · JCR Q1 (Neurosciences) · CAS Tier 1 (Medicine) · Pub Date: 2024-10-01 (Epub 2024-09-27) · DOI: 10.1016/j.tins.2024.08.006
Lyle Muller, Patricia S Churchland, Terrence J Sejnowski
{"title":"变压器和皮质波:跨时空语境牵引的编码器。","authors":"Lyle Muller, Patricia S Churchland, Terrence J Sejnowski","doi":"10.1016/j.tins.2024.08.006","DOIUrl":null,"url":null,"abstract":"<p><p>The capabilities of transformer networks such as ChatGPT and other large language models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence - into a long 'encoding vector' that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, 'self-attention' applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity traveling across single cortical areas, or multiple regions on the whole-brain scale, could implement a similar encoding principle. By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable a temporal context to be extracted from sequences of sensory inputs, the same computational principle as that used in transformers.</p>","PeriodicalId":23325,"journal":{"name":"Trends in Neurosciences","volume":" ","pages":"788-802"},"PeriodicalIF":14.6000,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Transformers and cortical waves: encoders for pulling in context across time.\",\"authors\":\"Lyle Muller, Patricia S Churchland, Terrence J Sejnowski\",\"doi\":\"10.1016/j.tins.2024.08.006\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The capabilities of transformer networks such as ChatGPT and other large language models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence - into a long 'encoding vector' that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, 'self-attention' applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity traveling across single cortical areas, or multiple regions on the whole-brain scale, could implement a similar encoding principle. 
By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable a temporal context to be extracted from sequences of sensory inputs, the same computational principle as that used in transformers.</p>\",\"PeriodicalId\":23325,\"journal\":{\"name\":\"Trends in Neurosciences\",\"volume\":\" \",\"pages\":\"788-802\"},\"PeriodicalIF\":14.6000,\"publicationDate\":\"2024-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Trends in Neurosciences\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1016/j.tins.2024.08.006\",\"RegionNum\":1,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/9/27 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"NEUROSCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Trends in Neurosciences","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1016/j.tins.2024.08.006","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/9/27 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"NEUROSCIENCES","Score":null,"Total":0}
Citations: 0

Abstract

The capabilities of transformer networks such as ChatGPT and other large language models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence - into a long 'encoding vector' that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, 'self-attention' applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity traveling across single cortical areas, or multiple regions on the whole-brain scale, could implement a similar encoding principle. By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable a temporal context to be extracted from sequences of sensory inputs, the same computational principle as that used in transformers.

Source journal: Trends in Neurosciences (Medicine - Neuroscience)
CiteScore: 26.50 · Self-citation rate: 1.30% · Articles per year: 123 · Review time: 6-12 weeks
About the journal: For over four decades, Trends in Neurosciences (TINS) has been a prominent source of inspiring reviews and commentaries across all disciplines of neuroscience. TINS is a monthly, peer-reviewed journal, and its articles are curated by the Editor and authored by leading researchers in their respective fields. The journal communicates exciting advances in brain research, serves as a voice for the global neuroscience community, and highlights the contribution of neuroscientific research to medicine and society.