Transformers and cortical waves: encoders for pulling in context across time.

Trends in Neurosciences · IF 14.6 · JCR Q1 (Neurosciences) · CAS Tier 1 (Medicine) · Pub Date: 2024-10-01 (Epub 2024-09-27) · DOI: 10.1016/j.tins.2024.08.006
Lyle Muller, Patricia S Churchland, Terrence J Sejnowski
{"title":"变压器和皮质波:跨时空语境牵引的编码器。","authors":"Lyle Muller, Patricia S Churchland, Terrence J Sejnowski","doi":"10.1016/j.tins.2024.08.006","DOIUrl":null,"url":null,"abstract":"<p><p>The capabilities of transformer networks such as ChatGPT and other large language models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence - into a long 'encoding vector' that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, 'self-attention' applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity traveling across single cortical areas, or multiple regions on the whole-brain scale, could implement a similar encoding principle. By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable a temporal context to be extracted from sequences of sensory inputs, the same computational principle as that used in transformers.</p>","PeriodicalId":23325,"journal":{"name":"Trends in Neurosciences","volume":" ","pages":"788-802"},"PeriodicalIF":14.6000,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Transformers and cortical waves: encoders for pulling in context across time.\",\"authors\":\"Lyle Muller, Patricia S Churchland, Terrence J Sejnowski\",\"doi\":\"10.1016/j.tins.2024.08.006\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The capabilities of transformer networks such as ChatGPT and other large language models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence - into a long 'encoding vector' that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, 'self-attention' applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity traveling across single cortical areas, or multiple regions on the whole-brain scale, could implement a similar encoding principle. 
By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable a temporal context to be extracted from sequences of sensory inputs, the same computational principle as that used in transformers.</p>\",\"PeriodicalId\":23325,\"journal\":{\"name\":\"Trends in Neurosciences\",\"volume\":\" \",\"pages\":\"788-802\"},\"PeriodicalIF\":14.6000,\"publicationDate\":\"2024-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Trends in Neurosciences\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1016/j.tins.2024.08.006\",\"RegionNum\":1,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/9/27 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"NEUROSCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Trends in Neurosciences","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1016/j.tins.2024.08.006","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/9/27 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"NEUROSCIENCES","Score":null,"Total":0}
Citations: 0

Abstract

The capabilities of transformer networks such as ChatGPT and other large language models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence - into a long 'encoding vector' that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, 'self-attention' applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity traveling across single cortical areas, or multiple regions on the whole-brain scale, could implement a similar encoding principle. By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable a temporal context to be extracted from sequences of sensory inputs, the same computational principle as that used in transformers.

Source journal: Trends in Neurosciences (Medicine - Neuroscience)
CiteScore: 26.50 · Self-citation rate: 1.30% · Articles per year: 123 · Review time: 6-12 weeks
About the journal: For over four decades, Trends in Neurosciences (TINS) has been a prominent source of inspiring reviews and commentaries across all disciplines of neuroscience. TINS is a monthly, peer-reviewed journal, and its articles are curated by the Editor and authored by leading researchers in their respective fields. The journal communicates exciting advances in brain research, serves as a voice for the global neuroscience community, and highlights the contribution of neuroscientific research to medicine and society.