Beyond Markov: Transformers, memory, and attention.
Thomas Parr, Giovanni Pezzulo, Karl Friston
Pub Date: 2025-01-01 | Epub Date: 2025-04-15 | DOI: 10.1080/17588928.2025.2484485 | Pages: 5-23
This paper asks what predictive processing models of brain function can learn from the success of transformer architectures. We suggest that transformer architectures have been successful because they implicitly commit to a non-Markovian generative model - one in which memory is needed to contextualize current observations and make predictions about the future. Interestingly, both the notion of working memory in cognitive science and transformer architectures rely heavily upon the concept of attention. We argue that the move beyond Markov is crucial for constructing generative models capable of dealing with much of the sequential data - and certainly the language - that our brains contend with. We characterize two broad approaches to this problem - deep temporal hierarchies and autoregressive models - with transformers being an example of the latter. Our key conclusion is that transformers benefit from embedding spaces that place strong metric priors on an implicit latent variable, and from using this metric to direct a form of attention that highlights the most relevant, and not merely the most recent, previous elements of a sequence when predicting the next.
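As an editorial aside, the mechanism the abstract appeals to can be made concrete. Below is a minimal sketch of scaled dot-product attention in numpy - not the authors' model; all names, dimensions, and data are illustrative assumptions - showing how query-key similarity in an embedding metric weights past elements by relevance rather than recency.

```python
# Minimal sketch of scaled dot-product attention (illustrative only).
import numpy as np

def scaled_dot_product_attention(query, keys, values):
    """Weight past elements by query-key similarity in the embedding metric."""
    d_k = keys.shape[-1]
    # Similarity of the current query to every previous element.
    scores = query @ keys.T / np.sqrt(d_k)
    # Softmax turns similarities into attention weights over the past.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # The prediction context is a relevance-weighted mixture of past values.
    return weights @ values, weights

rng = np.random.default_rng(0)
seq, d = 8, 16                   # 8 past tokens, 16-dim embeddings (assumed)
K = rng.normal(size=(seq, d))    # key embeddings of previous elements
V = rng.normal(size=(seq, d))    # value embeddings of previous elements
q = rng.normal(size=(1, d))      # query embedding of the current element
context, w = scaled_dot_product_attention(q, K, V)
print(w.round(2))  # weights need not favor the most recent token
```

The point of the sketch is the last line: nothing in the weighting privileges temporal adjacency, which is how attention reaches beyond a Markov blanket over the immediately preceding state.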
{"title":"Beyond Markov: Transformers, memory, and attention.","authors":"Thomas Parr, Giovanni Pezzulo, Karl Friston","doi":"10.1080/17588928.2025.2484485","DOIUrl":"10.1080/17588928.2025.2484485","url":null,"abstract":"<p><p>This paper asks what predictive processing models of brain function can learn from the success of transformer architectures. We suggest that the reason transformer architectures have been successful is that they implicitly commit to a non-Markovian generative model - in which we need memory to contextualize our current observations and make predictions about the future. Interestingly, both the notions of working memory in cognitive science and transformer architectures rely heavily upon the concept of attention. We will argue that the move beyond Markov is crucial in the construction of generative models capable of dealing with much of the sequential data - and certainly language - that our brains contend with. We characterize two broad approaches to this problem - deep temporal hierarchies and autoregressive models - with transformers being an example of the latter. Our key conclusions are that transformers benefit heavily from their use of embedding spaces that place strong metric priors on an implicit latent variable and utilize this metric to direct a form of attention that highlights the most relevant, and not only the most recent, previous elements in a sequence to help predict the next.</p>","PeriodicalId":10413,"journal":{"name":"Cognitive Neuroscience","volume":" ","pages":"5-23"},"PeriodicalIF":2.2,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143984568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autonomous semantics and syntax on-demand in neurocomputational models of language.
Giosuè Baggio
Pub Date: 2025-01-01 | Epub Date: 2025-09-18 | DOI: 10.1080/17588928.2025.2561588 | Pages: 81-82
ROSE is a rare example of a neurocomputational model of language that attempts, and partly manages, to align a formal theory of syntax and parsing with an oscillations-based 'neural code' that could implement the required operations. ROSE successfully reconciles hierarchical and predictive syntactic processing, but I argue that models of language in the brain should make room for the possibility that meaning may also be derived in the absence of any syntactic computation, be it hierarchical or predictive.
{"title":"Autonomous semantics and syntax on-demand in neurocomputational models of language.","authors":"Giosuè Baggio","doi":"10.1080/17588928.2025.2561588","DOIUrl":"10.1080/17588928.2025.2561588","url":null,"abstract":"<p><p>ROSE is a rare example of a neurocomputational model of language that attempts, and partly manages, to align a formal theory of syntax and parsing with an oscillations-based 'neural code' that could implement the required operations. ROSE successfully reconciles hierarchical and predictive syntactic processing, but I argue that models of language in the brain should make room for the possibility that meaning may also be derived in the absence of any syntactic computation, be it hierarchical or predictive.</p>","PeriodicalId":10413,"journal":{"name":"Cognitive Neuroscience","volume":" ","pages":"81-82"},"PeriodicalIF":2.2,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145079689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Non-Markovian systems, phenomenology, and the challenges of capturing meaning and context - comment on Parr, Pezzulo, and Friston (2025).
Mahault Albarracin, Dalton A R Sakthivadivel
Pub Date: 2025-01-01 | Epub Date: 2025-07-04 | DOI: 10.1080/17588928.2025.2523889 | Pages: 35-36
Parr et al. explore the problem of non-Markovian processes, in which the future state of a system depends not only on its present state but also on its past states. The authors suggest that the success of transformer networks in dealing with sequential data, such as language, stems from their ability to address this non-Markovian structure through attention mechanisms. This commentary builds on their discussion, aiming to link it to notions in Husserlian phenomenology and to explore the implications for understanding meaning, context, and the nature of knowledge.
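For readers who want the distinction pinned down, the Markov property and its violation can be stated compactly (a standard textbook formulation, not notation taken from the target article or this commentary):

```latex
% Markov: the present state screens off the entire past.
P(x_{t+1} \mid x_t, x_{t-1}, \ldots, x_1) = P(x_{t+1} \mid x_t)

% Non-Markovian: the dependence on history does not collapse,
% so prediction requires some form of memory over past states.
P(x_{t+1} \mid x_t, x_{t-1}, \ldots, x_1) \neq P(x_{t+1} \mid x_t)
```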
{"title":"Non-Markovian systems, phenomenology, and the challenges of capturing meaning and context - comment on Parr, Pezzulo, and Friston (2025).","authors":"Mahault Albarracin, Dalton A R Sakthivadivel","doi":"10.1080/17588928.2025.2523889","DOIUrl":"10.1080/17588928.2025.2523889","url":null,"abstract":"<p><p>Parr, et al., explore the problem of non-Markovian pro cesses, in which the future state of a system depends not only on its present state but also on its past states. The authors suggest that the success of transformer networks in dealing with sequential data, such as language, stems from their ability to address this non-Markovian nature through the use of attention mechanisms. This commentary builds on their discussion, aiming to link it to some notions in Husserlian phenomenology and explore the implications for understanding meaning, context, and the nature of knowledge.</p>","PeriodicalId":10413,"journal":{"name":"Cognitive Neuroscience","volume":" ","pages":"35-36"},"PeriodicalIF":2.2,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144559411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ROSE: A Universal Neural Grammar.
Elliot Murphy
Pub Date: 2025-01-01 | Epub Date: 2025-07-14 | DOI: 10.1080/17588928.2025.2523875 | Pages: 49-80
Processing natural language syntax requires a negotiation between symbolic and subsymbolic representations. Building on the recent representation, operation, structure, encoding (ROSE) neurocomputational architecture for syntax, which scales from single units to inter-areal dynamics, I discuss the prospects of reconciling the neural code for hierarchical syntax with predictive processes. Here, the higher levels of ROSE provide instructions for symbolic phrase structure representations (S/E), while the lower levels provide probabilistic aspects of linguistic processing (R/O), with different types of cross-frequency coupling hypothesized to interface these domains. I argue that ROSE provides a possible infrastructure for flexibly implementing distinct types of minimalist grammar parsers for the real-time processing of language. This perspective helps furnish a more restrictive 'core language network' in the brain than contemporary approaches that isolate general sentence composition. I define the language network as being critically involved in executing specific parsing operations (i.e., establishing phrasal categories, tracking tree-structure depth, resolving dependencies, and retrieving proprietary lexical representations), capturing these network-defining operations jointly with probabilistic aspects of parsing. ROSE offers a 'mesoscopic protectorate' for natural language: an intermediate level of emergent organizational complexity that demands multi-scale modeling. By drawing principled relations across the computational, algorithmic, and implementational Marrian levels, ROSE offers new constraints on what a unified neurocomputational settlement for natural language syntax might look like, providing a tentative scaffold for a 'Universal Neural Grammar' - a species-specific format for neurally organizing the construction of compositional syntactic structures, which matures in accordance with a genetically determined biological matrix.
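Two of the parsing quantities the abstract names - phrasal categories and tree-structure depth - can be illustrated with a toy binary Merge. The sketch below is emphatically not Murphy's implementation of ROSE or of a minimalist grammar parser; the data structure, labels, and example derivation are illustrative assumptions.

```python
# Toy binary Merge over labeled constituents (illustrative only).
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    label: str                      # phrasal category, e.g. 'DP', 'VP'
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def merge(label: str, left: Node, right: Node) -> Node:
    """Combine two syntactic objects into one labeled binary constituent."""
    return Node(label, left, right)

def depth(node: Node) -> int:
    """Tree-structure depth; a bare lexical item counts as depth 1."""
    if node.left is None and node.right is None:
        return 1
    return 1 + max(depth(node.left), depth(node.right))

# 'the cat slept' -> [TP [DP the cat] slept]
dp = merge("DP", Node("the"), Node("cat"))
tp = merge("TP", dp, Node("slept"))
print(depth(tp))  # 3
```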
{"title":"ROSE: A Universal Neural Grammar.","authors":"Elliot Murphy","doi":"10.1080/17588928.2025.2523875","DOIUrl":"10.1080/17588928.2025.2523875","url":null,"abstract":"<p><p>Processing natural language syntax requires a negotiation between symbolic and subsymbolic representations. Building on the recent representation, operation, structure, encoding (ROSE) neurocomputational architecture for syntax that scales from single units to inter-areal dynamics, I discuss the prospects of reconciling the neural code for hierarchical syntax with predictive processes. Here, the higher levels of ROSE provide instructions for symbolic phrase structure representations (S/E), while the lower levels provide probabilistic aspects of linguistic processing (R/O), with different types of cross-frequency coupling being hypothesized to interface these domains. I argue that ROSE provides a possible infrastructure for flexibly implementing distinct types of minimalist grammar parsers for the real-time processing of language. This perspective helps furnish a more restrictive 'core language network' in the brain than contemporary approaches that isolate general sentence composition. I define the language network as being critically involved in executing specific parsing operations (i.e. establishing phrasal categories, tree-structure depth, resolving dependencies, and retrieving proprietary lexical representations), capturing these network-defining operations jointly with probabilistic aspects of parsing. ROSE offers a 'mesoscopic protectorate' for natural language; an intermediate level of emergent organizational complexity that demands multi-scale modeling. By drawing principled relations across computational, algorithmic and implementational Marrian levels, ROSE offers new constraints on what a unified neurocomputational settlement for natural language syntax might look like, providing a tentative scaffold for a 'Universal Neural Grammar' - a species-specific format for neurally organizing the construction of compositional syntactic structures, which matures in accordance with a genetically determined biological matrix.</p>","PeriodicalId":10413,"journal":{"name":"Cognitive Neuroscience","volume":" ","pages":"49-80"},"PeriodicalIF":2.2,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144625415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Auditory facilitation in deterministic versus stochastic worlds.
Berfin Bastug, Urte Roeber, Erich Schröger
Pub Date: 2025-01-01 | Epub Date: 2025-04-29 | DOI: 10.1080/17588928.2025.2497762 | Pages: 93-99
The brain learns statistical regularities in sensory sequences, enhancing behavioral performance for predictable stimuli and impairing it for unpredictable stimuli. While previous research has shown that violations of non-informative regularities hinder task performance, it remains unclear whether predictable but task-irrelevant structures can facilitate performance. In a tone duration discrimination task, we manipulated the task-irrelevant pitch dimension by varying transition probabilities (TPs) between successive tone frequencies. Participants judged duration, while pitch sequences were either deterministic (a rule-governed pitch pattern, TP = 1) or stochastic (no discernible pitch pattern, TP = 1/number of pitch levels). Pitch was task-irrelevant and did not predict duration. Reaction times (RTs) were significantly faster for deterministic sequences, suggesting that predictability in a task-irrelevant dimension still facilitates task performance. RTs were also faster for two-tone sequences than for eight-tone sequences, likely due to reduced memory load. These findings suggest that the benefits of statistical learning extend beyond task-relevant dimensions, supporting a predictive coding framework in which the brain integrates predictable sensory input to optimize cognitive processing.
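The two sequence regimes are easy to picture in code. This is a minimal sketch of the manipulation as the abstract describes it - the parameter names and values are illustrative, not the study's actual stimulus-generation code.

```python
# Deterministic (TP = 1) vs. stochastic (TP = 1/n_pitch) pitch sequences.
import numpy as np

rng = np.random.default_rng(42)
n_pitch, n_tones = 8, 24          # assumed pitch levels and sequence length

# Deterministic regime: each pitch level is always followed by the same
# successor (here, a fixed cyclic rule over the pitch levels).
deterministic = [i % n_pitch for i in range(n_tones)]

# Stochastic regime: every successor is equally likely, so the pitch
# dimension carries no predictive structure.
stochastic = rng.integers(0, n_pitch, size=n_tones).tolist()

print(deterministic[:8])  # [0, 1, 2, 3, 4, 5, 6, 7] -- rule-governed
print(stochastic[:8])     # unpredictable draws from the same pitch set
```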
{"title":"Auditory facilitation in deterministic versus stochastic worlds.","authors":"Berfin Bastug, Urte Roeber, Erich Schröger","doi":"10.1080/17588928.2025.2497762","DOIUrl":"10.1080/17588928.2025.2497762","url":null,"abstract":"<p><p>The brain learns statistical regularities in sensory sequences, enhancing behavioral performance for predictable stimuli while impairing behavioral performance for unpredictable stimuli. While previous research has shown that violations of non-informative regularities hinder task performance, it remains unclear whether predictable but task-irrelevant structures can facilitate performance. In a tone duration discrimination task, we manipulated the task-irrelevant pitch dimension by varying transition probabilities (TP) between successive tone frequencies. Participants judged duration, while pitch sequences were either deterministic (a rule-governed pitch pattern, TP = 1) or stochastic (no discernible pitch pattern, TP = 1/number of pitch levels). The tone pitch was task-irrelevant and it did not predict duration. Results showed that reaction times (RTs) were significantly faster for deterministic sequences, suggesting that predictability in a task-irrelevant dimension still facilitates task performance. RTs were also faster in two-tone sequences compared to eight-tone sequences, likely due to reduced memory load. These findings suggest that statistical learning benefits extend beyond task-relevant dimensions, supporting a predictive coding framework in which the brain integrates predictable sensory input to optimize cognitive processing.</p>","PeriodicalId":10413,"journal":{"name":"Cognitive Neuroscience","volume":" ","pages":"93-99"},"PeriodicalIF":2.2,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143969935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Beyond individuals: Collective predictive coding for memory, attention, and the emergence of language.
Tadahiro Taniguchi
Pub Date: 2025-01-01 | Epub Date: 2025-06-17 | DOI: 10.1080/17588928.2025.2518942 | Pages: 41-42
{"title":"Beyond individuals: Collective predictive coding for memory, attention, and the emergence of language.","authors":"Tadahiro Taniguchi","doi":"10.1080/17588928.2025.2518942","DOIUrl":"10.1080/17588928.2025.2518942","url":null,"abstract":"","PeriodicalId":10413,"journal":{"name":"Cognitive Neuroscience","volume":" ","pages":"41-42"},"PeriodicalIF":2.2,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144316040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
How the brain recycled memory circuits for language: An evolutionary perspective on the ROSE model.
Edward Ruoyang Shi
Pub Date: 2025-01-01 | Epub Date: 2025-09-18 | DOI: 10.1080/17588928.2025.2561587 | Pages: 88-90
{"title":"How the brain recycled memory circuits for language: An evolutionary perspective on the ROSE model.","authors":"Edward Ruoyang Shi","doi":"10.1080/17588928.2025.2561587","DOIUrl":"10.1080/17588928.2025.2561587","url":null,"abstract":"","PeriodicalId":10413,"journal":{"name":"Cognitive Neuroscience","volume":" ","pages":"88-90"},"PeriodicalIF":2.2,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145079817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Beyond prediction: comments on the format of natural intelligence.
Elliot Murphy
Pub Date: 2025-01-01 | Epub Date: 2025-06-18 | DOI: 10.1080/17588928.2025.2521403 | Pages: 37-40
{"title":"Beyond prediction: comments on the format of natural intelligence.","authors":"Elliot Murphy","doi":"10.1080/17588928.2025.2521403","DOIUrl":"10.1080/17588928.2025.2521403","url":null,"abstract":"","PeriodicalId":10413,"journal":{"name":"Cognitive Neuroscience","volume":" ","pages":"37-40"},"PeriodicalIF":2.2,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144324646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predictive coding of cognitive processes in natural and artificial systems.
Joseph B Hopfinger, Scott D Slotnick
Pub Date: 2025-01-01 | Epub Date: 2025-11-17 | DOI: 10.1080/17588928.2025.2584209 | Volume 16, Issue 1-4, Pages: 1-4
With recent developments in artificial intelligence (AI), there is great interest in how mechanisms of human cognitive processing may be instantiated in AI models, and in how those models may help us better understand human cognitive and neural processes. Recent research suggests that predictive coding theories and their associated generative models may help explain visual perception and language production, while newer AI models include mechanisms akin to human memory and attention. This special issue of Cognitive Neuroscience: Current Debates, Research & Reports presents 16 new papers that highlight important topics and present exciting new data, models, and controversies. The articles include a new discussion paper by Parr, Pezzulo, and Friston exploring how transformer architectures utilize non-Markovian generative models and how an attention-like process is critical for processing complex sequential data. This is followed by seven insightful commentaries and a reply from the authors. A discussion paper on a new neurocomputational model of syntax is provided by Murphy, in which predictive processes are integrated into a multi-level, hierarchical syntax architecture. This is followed by five commentaries suggesting important evolutionary and developmental perspectives and ways to explore and test the model. Finally, an empirical article by Bastug, Roeber, and Schröger on auditory perception presents new evidence suggesting that distracting information requires less cognitive processing when it is predictable. The topics of this special issue are evolving rapidly and promise to be at the heart of future developments in artificial learning systems and in theories of the brain mechanisms that mediate cognitive processes.
{"title":"Predictive coding of cognitive processes in natural and artificial systems.","authors":"Joseph B Hopfinger, Scott D Slotnick","doi":"10.1080/17588928.2025.2584209","DOIUrl":"https://doi.org/10.1080/17588928.2025.2584209","url":null,"abstract":"<p><p>With recent developments in artificial intelligence (AI), there is great interest in how mechanisms of human cognitive processing may be instantiated in those models and how those models may help us better understand human cognitive and neural processes. Recent research suggests predictive coding theories and associated generative models may help explain the processes of visual perception and language production, while newer AI models include mechanisms akin to human memory and attention. This special issue of <i>Cognitive Neuroscience: Current Debates, Research & Reports</i> presents 16 new papers that highlight important topics and present exciting new data, models, and controversies. The articles include a new discussion paper by Parr, Pezzulo, and Friston exploring how transformer architectures utilize non-Markovian generative models and how an attention-like process is critical for processing complex sequential data. This is followed by seven insightful commentaries and a reply from the authors. A discussion paper on a new neurocomputational model of syntax is provided by Murphy, in which predictive processes are integrated in a multi-level, hierarchical syntax architecture. This is followed by five commentaries suggesting important evolutionary and developmental perspectives and ways to explore and test the model. Finally, an empirical article by Bastug, Roeber, and Schröger on auditory perception presents new evidence suggesting that distracting information requires less cognitive processing when it is predictable. The topics of this special issue are evolving rapidly and promise to be at the heart of future developments in artificial learning systems and theories of the brain mechanisms that mediate cognitive processes.</p>","PeriodicalId":10413,"journal":{"name":"Cognitive Neuroscience","volume":"16 1-4","pages":"1-4"},"PeriodicalIF":2.2,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145539234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}