
Latest Publications in Neural Computation

Working Memory and Self-Directed Inner Speech Enhance Multitask Generalization in Active Inference
IF 2.1 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-12-22 · DOI: 10.1162/NECO.a.36
Jeffrey Frederic Queißer;Jun Tani
This simulation study shows how a set of working memory tasks can be acquired simultaneously through interaction between a stacked recurrent neural network (RNN) and multiple working memories. In these tasks, temporal patterns are provided, followed by linguistically specified task goals. Training is performed in a supervised manner by minimizing the free energy, and goal-directed tasks are performed using the active inference (AIF) framework. Our simulation results show that the best task performance is obtained when two working memory modules are used instead of one or none and when self-directed inner speech is incorporated during task execution. Detailed analysis indicates that a temporal hierarchy develops in the stacked RNN module under these optimal conditions. We argue that the model’s capacity for generalization across novel task configurations is supported by the structured interplay between working memory and the generation of self-directed language outputs during task execution. This interplay promotes internal representations that reflect task structure, which in turn support generalization by enabling a functional separation between content encoding and control dynamics within the memory architecture.
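Since training in this model amounts to free-energy minimization, a minimal sketch of that objective may be useful. Under diagonal-Gaussian assumptions (ours, for illustration; the function names and the use of the mean prediction are not from the paper), the free energy splits into an accuracy term and a complexity term:

    import numpy as np

    def gaussian_kl(mu_q, var_q, mu_p, var_p):
        # KL( N(mu_q, diag var_q) || N(mu_p, diag var_p) ) for diagonal Gaussians
        return 0.5 * np.sum(np.log(var_p / var_q)
                            + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

    def free_energy(obs, pred, obs_var, mu_q, var_q, mu_p, var_p):
        # Accuracy: Gaussian negative log-likelihood of the observation under the
        # model's (mean) prediction. Complexity: KL of the approximate posterior
        # q from the prior p. Minimizing the sum trains the generative model.
        nll = 0.5 * np.sum(np.log(2.0 * np.pi * obs_var)
                           + (obs - pred) ** 2 / obs_var)
        return nll + gaussian_kl(mu_q, var_q, mu_p, var_p)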
Citations: 0
Effective Learning Rules as Natural Gradient Descent
IF 2.1 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-12-22 · DOI: 10.1162/NECO.a.1474
Lucas Shoji;Kenta Suzuki;Leo Kozachkov
We establish that a broad class of effective learning rules—those that improve a scalar performance measure over a given time window—can be expressed as natural gradient descent with respect to an appropriately defined metric. Specifically, parameter updates in this class can always be written as the product of a symmetric positive-definite matrix and the negative gradient of a loss function encoding the task. Given the high level of generality, our findings formally support the idea that the gradient is a fundamental object underlying all learning processes. Our results are valid across a wide range of common settings, including continuous-time, discrete-time, stochastic, and higher-order learning rules, as well as loss functions with explicit time dependence. Beyond providing a unified framework for learning, our results also have practical implications for control as well as experimental neuroscience.
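The central claim, that any loss-decreasing update equals a natural gradient step under some metric, can be checked numerically with one simple symmetric positive-definite construction (an illustrative choice of ours; the paper's construction may differ). Whenever an update u satisfies uᵀg < 0 for the loss gradient g, the matrix below is symmetric positive-definite and reproduces u exactly as -Mg:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    g = rng.normal(size=n)                     # gradient of the loss
    u = rng.normal(size=n)
    if u @ g > 0:
        u = -u                                 # make the update loss-decreasing

    lam = 0.5                                  # any positive value works
    P = np.eye(n) - np.outer(g, g) / (g @ g)   # projector orthogonal to g
    M = np.outer(u, u) / (-(u @ g)) + lam * P  # candidate metric

    assert np.allclose(M, M.T)                 # symmetric
    assert np.all(np.linalg.eigvalsh(M) > 0)   # positive definite
    assert np.allclose(-M @ g, u)              # u is a natural gradient step

The rank-one term reproduces the update along g (the projector P annihilates g), while the lam-weighted projector fills in the remaining directions to make M positive definite.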
Citations: 0
Possible Principles for Aligned Structure Learning Agents
IF 2.1 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-12-22 · DOI: 10.1162/NECO.a.39
Lancelot Da Costa;Tomáš Gavenčiak;David Hyland;Mandana Samiei;Cristian Dragos-Manta;Candice Pattisapu;Adeel Razi;Karl Friston
This paper offers a road map for the development of scalable aligned artificial intelligence (AI) from first-principles descriptions of natural intelligence. In brief, a possible path toward scalable aligned AI rests on enabling artificial agents to learn a good model of the world that includes a good model of our preferences. For this, the main objective is creating agents that learn to represent the world and other agents’ world models, a problem that falls under structure learning (also known as causal representation learning or model discovery). We expose the structure learning and alignment problems with this goal in mind, as well as principles to guide us forward, synthesizing various ideas across mathematics, statistics, and cognitive science. We discuss the essential role of core knowledge, information geometry, and model reduction in structure learning and suggest core structural modules to learn a wide range of naturalistic worlds. We then outline a way toward aligned agents through structure learning and theory of mind. As an illustrative example, we mathematically sketch Asimov’s laws of robotics, which prescribe agents to act cautiously to minimize the ill-being of other agents. We supplement this example by proposing refined approaches to alignment. These observations may guide the development of artificial intelligence in helping to scale existing, or design new, aligned structure learning systems.
Citations: 0
Neural Associative Skill Memories for Safer Robotics and Modeling Human Sensorimotor Repertoires
IF 2.1 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-12-22 · DOI: 10.1162/NECO.a.1475
Pranav Mahajan;Mufeng Tang;T. Ed Li;Ioannis Havoutis;Ben Seymour
Modern robots face a challenge shared by biological systems: how to learn and adaptively express multiple sensorimotor skills. A key aspect of this is developing an internal model of expected sensorimotor experiences to detect and react to unexpected events, guiding self-preserving behaviors. Associative skill memories (ASMs) address this by linking movement primitives to sensory feedback, but existing implementations rely on hard-coded libraries of individual skills. A key unresolved problem is how a single neural network can learn a repertoire of skills while enabling integrated fault detection and context-aware execution. Here we introduce neural associative skill memories (neural ASMs), a framework that uses self-supervised temporal predictive coding to integrate skill learning and expression using biologically plausible local learning rules. Unlike traditional ASMs, which require explicit skill selection, neural ASMs implicitly recognize and express skills through contextual inference, enabling fault detection using “predictive surprise” across the entire learned repertoire. Compared to recurrent neural networks trained using backpropagation through time, our model achieves comparable qualitative performance in skill memory expression while using local learning rules and predicts a biologically relevant speed-versus-accuracy trade-off. By integrating fault detection, reactive control, and skill expression into a single energy-based architecture, neural ASMs contribute to safer, self-preserving robotics and provide a computational lens to study biological sensorimotor learning.
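A rough sketch of the temporal predictive-coding ingredient (our simplification; the paper's stacked network and rules are richer) shows how inference, learning, and fault detection can all run on one local error signal. Shapes and learning rates here are hypothetical:

    import numpy as np

    def predictive_coding_step(x, z, W, lr_z=0.1, lr_w=0.01):
        # One layer predicting observation x from latent state z via weights W.
        f = np.tanh(z)
        e = x - W @ f                                # local prediction error
        z = z + lr_z * (W.T @ e) * (1.0 - f ** 2)    # inference: descend the error
        W = W + lr_w * np.outer(e, f)                # learning: local, Hebbian-like
        surprise = float(e @ e)                      # 'predictive surprise' for faults
        return z, W, surprise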
Citations: 0
Simulated Complex Cells Contribute to Object Recognition Through Representational Untangling.
IF 2.1 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-12-10 · DOI: 10.1162/NECO.a.1480
Mitchell B Slapik, Harel Z Shouval

The visual system performs a remarkable feat: it takes complex retinal activation patterns and decodes them for object recognition. This operation, termed "representational untangling," organizes neural representations by clustering similar objects together while separating different categories of objects. While representational untangling is usually associated with higher-order visual areas like the inferior temporal cortex, it remains unclear how the early visual system contributes to this process, whether through highly selective neurons or high-dimensional population codes. This article investigates how a computational model of early vision contributes to representational untangling. Using a computational visual hierarchy and two different data sets consisting of numerals and objects, we demonstrate that simulated complex cells significantly contribute to representational untangling for object recognition. Our findings challenge prior theories by showing that untangling does not depend on skewed, sparse, or high-dimensional representations. Instead, simulated complex cells reformat visual information into a low-dimensional, yet more separable, neural code, striking a balance between representational untangling and computational efficiency.
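A standard way to simulate a complex cell, and a plausible stand-in for the model cells discussed here (the paper's visual hierarchy is not reproduced), is the energy model: square and sum the responses of a quadrature pair of Gabor filters, giving a response invariant to stimulus phase:

    import numpy as np

    def gabor(size, theta, phase, sigma=2.0, freq=0.25):
        # Oriented Gabor kernel: Gaussian envelope times an oriented sinusoid.
        ax = np.arange(size) - size // 2
        yy, xx = np.meshgrid(ax, ax, indexing="ij")
        xr = xx * np.cos(theta) + yy * np.sin(theta)
        env = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
        return env * np.cos(2.0 * np.pi * freq * xr + phase)

    def conv2_valid(img, k):
        # Plain 'valid' 2-D cross-correlation, kept dependency-free.
        H, W = img.shape
        h, w = k.shape
        out = np.empty((H - h + 1, W - w + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(img[i:i + h, j:j + w] * k)
        return out

    def complex_cell(img, theta, size=9):
        # Energy model: quadrature pair (phases 0 and pi/2), squared and summed.
        even = conv2_valid(img, gabor(size, theta, 0.0))
        odd = conv2_valid(img, gabor(size, theta, np.pi / 2))
        return np.sqrt(even ** 2 + odd ** 2)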

Citations: 0
Sum-of-Norms Regularized Nonnegative Matrix Factorization.
IF 2.1 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-12-10 · DOI: 10.1162/NECO.a.1482
Andersen Ang, Waqas Bin Hamed, Hans De Sterck

When applying nonnegative matrix factorization (NMF), the rank parameter is generally unknown. This rank, called the nonnegative rank, is usually estimated heuristically since computing its exact value is NP-hard. In this work, we propose an approximation method to estimate the rank on the fly while solving NMF. We use the sum-of-norms (SON), a group-lasso structure that encourages pairwise similarity, to reduce the rank of a factor matrix when the initial rank is overestimated. On various data sets, SON-NMF can reveal the correct nonnegative rank of the data without prior knowledge or parameter tuning. SON-NMF is a nonconvex, nonsmooth, nonseparable, and nonproximable problem, making it nontrivial to solve. First, since rank estimation in NMF is NP-hard, the proposed approach does not benefit from lower computational complexity. Using a graph-theoretic argument, we prove that the complexity of SON-NMF is essentially irreducible. Second, the per-iteration cost of algorithms for SON-NMF can be high. This motivates us to propose a first-order BCD algorithm that approximately solves SON-NMF with low per-iteration cost via the proximal average operator. SON-NMF exhibits favorable features for applications. Besides the ability to automatically estimate the rank from data, SON-NMF can handle rank-deficient data matrices and detect weak components with little energy. Furthermore, in hyperspectral imaging, SON-NMF naturally addresses the issue of spectral variability.
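One plausible reading of the objective (our reconstruction from the abstract; the exact weighting and the proximal-average solver are not reproduced) is Frobenius reconstruction error plus a sum-of-norms penalty over pairs of columns of W. Columns that the penalty drives together count as one component, which is how the rank estimate falls out:

    import numpy as np

    def son_nmf_objective(X, W, H, lam):
        # Reconstruction error plus a group-lasso (sum-of-norms) penalty that
        # pulls columns of W pairwise together when the rank is overestimated.
        recon = 0.5 * np.linalg.norm(X - W @ H, "fro") ** 2
        r = W.shape[1]
        son = sum(np.linalg.norm(W[:, i] - W[:, j])
                  for i in range(r) for j in range(i + 1, r))
        return recon + lam * son

    def estimated_rank(W, tol=1e-3):
        # Count clusters of (near-)identical columns: the revealed rank.
        reps = [W[:, 0]]
        for j in range(1, W.shape[1]):
            if all(np.linalg.norm(W[:, j] - c) > tol for c in reps):
                reps.append(W[:, j])
        return len(reps)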

Citations: 0
Approximation Rates in Fréchet Metrics: Barron Spaces, Paley-Wiener Spaces, and Fourier Multipliers.
IF 2.1 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-12-10 · DOI: 10.1162/NECO.a.1481
Ahmed Abdeljawad, Thomas Dittrich

Operator learning is a recent development in the simulation of partial differential equations by means of neural networks. The idea behind this approach is to learn the behavior of an operator, such that the resulting neural network is an approximate mapping in infinite-dimensional spaces that is capable of (approximately) simulating the solution operator governed by the partial differential equation. In our work, we study some general approximation capabilities for linear differential operators by approximating the corresponding symbol in the Fourier domain. Analogous to the structure of the class of Hörmander symbols, we consider the approximation with respect to a topology that is induced by a sequence of seminorms. In that sense, we measure the approximation error in terms of a Fréchet metric, and our main result identifies sufficient conditions for achieving a predefined approximation error. We then focus on a natural extension of our main theorem, in which we reduce the assumptions on the sequence of seminorms. Based on existing approximation results for the exponential spectral Barron space, we then present a concrete example of symbols that can be approximated well.
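For reference, the standard metric induced by a countable family of seminorms (p_k), presumably the kind of topology meant here, though the paper may weight the family differently, is

    d(f, g) = \sum_{k=0}^{\infty} 2^{-k} \, \frac{p_k(f - g)}{1 + p_k(f - g)}

Convergence in d is equivalent to convergence in every seminorm simultaneously: whenever 2^k d(f, g) < 1, one gets p_k(f - g) \le \frac{2^k d(f, g)}{1 - 2^k d(f, g)}, so a rate in the Fréchet metric yields a rate in each p_k with a k-dependent constant.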

Citations: 0
Fusing Foveal Fixations Using Linear Retinal Transformations and Bayesian Experimental Design
IF 2.1 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-11-18 · DOI: 10.1162/neco.a.33
Christopher K. I. Williams
Humans (and many vertebrates) face the problem of fusing together multiple fixations of a scene in order to obtain a representation of the whole, where each fixation uses a high-resolution fovea and decreasing resolution in the periphery. In this letter, we explicitly represent the retinal transformation of a fixation as a linear downsampling of a high-resolution latent image of the scene, exploiting the known geometry. This linear transformation allows us to carry out exact inference for the latent variables in factor analysis (FA) and mixtures of FA models of the scene. This also allows us to formulate and solve the choice of where to look next as a Bayesian experimental design problem using the expected information gain criterion. Experiments on the Frey faces and MNIST data sets demonstrate the effectiveness of our models.
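Because the retinal transformation is a fixed linear map, the latent posterior stays Gaussian and has a closed form. A minimal sketch under assumed notation (A the downsampling matrix, Lam the factor loadings, Psi_diag the diagonal observation-noise variances; names and shapes are ours, not the authors' code):

    import numpy as np

    def fa_posterior(y, A, Lam, mu, Psi_diag):
        # Model: y = A (Lam z + mu) + eps, z ~ N(0, I), eps ~ N(0, diag(Psi_diag)).
        # The linear retinal map A keeps inference over z exact.
        F = A @ Lam                        # effective loadings after downsampling
        FtPi = F.T * (1.0 / Psi_diag)      # F^T Psi^{-1} for diagonal noise
        Sigma = np.linalg.inv(np.eye(F.shape[1]) + FtPi @ F)  # posterior covariance
        mu_post = Sigma @ (FtPi @ (y - A @ mu))               # posterior mean
        return mu_post, Sigma

Fusing several fixations of the same scene then amounts to summing each fixation's FtPi @ F and FtPi @ (y - A @ mu) contributions before taking the single matrix inverse.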
Citations: 0
Model Predictive Control on the Neural Manifold
IF 2.1 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-11-18 · DOI: 10.1162/neco.a.37
Christof Fehrman;C. Daniel Meliza
Neural manifolds are an attractive theoretical framework for characterizing the complex behaviors of neural populations. However, many of the tools for identifying these low-dimensional subspaces are correlational and provide limited insight into the underlying dynamics. The ability to precisely control the latent activity of a circuit would allow researchers to investigate the structure and function of neural manifolds. We simulate controlling the latent dynamics of a neural population using closed-loop, dynamically generated sensory inputs. Using a spiking neural network (SNN) as a model of a neural circuit, we find low-dimensional representations of both the network activity (the neural manifold) and a set of salient visual stimuli. The fields of classical and optimal control offer a range of methods to choose from for controlling dynamics on the neural manifold, which differ in performance, computational cost, and ease of implementation. Here, we focus on two commonly used control methods: proportional-integral-derivative (PID) control and model predictive control (MPC). PID is a computationally lightweight controller that is simple to implement. In contrast, MPC is a model-based, anticipatory controller with a much higher computational cost and engineering overhead. We evaluate both methods on trajectory-following tasks in latent space, under partial observability and in the presence of unknown noise. While both controllers in some cases were able to successfully control the latent dynamics on the neural manifold, MPC consistently produced more accurate control and required less hyperparameter tuning. These results demonstrate how MPC can be applied on the neural manifold using data-driven dynamics models and provide a framework to experimentally test for causal relationships between manifold dynamics and external stimuli.
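Of the two baselines, PID is simple enough to state in full; a generic textbook form follows (gains and time step are placeholders, not values from the paper):

    import numpy as np

    class PID:
        # Proportional-integral-derivative controller for a vector error signal.
        def __init__(self, kp, ki, kd, dt):
            self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
            self.integral = 0.0
            self.prev_error = None

        def step(self, error):
            error = np.asarray(error, dtype=float)
            self.integral = self.integral + error * self.dt
            deriv = (np.zeros_like(error) if self.prev_error is None
                     else (error - self.prev_error) / self.dt)
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * deriv

Here error would be the gap between the target latent trajectory and the current estimate on the manifold, with the controller output mapped back to a sensory input. MPC instead optimizes a whole input sequence against a learned dynamics model at every step, which is where both its extra accuracy and its extra cost come from.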
Citations: 0
Boosting MCTS With Free Energy Minimization
IF 2.1 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-11-18 · DOI: 10.1162/neco.a.31
Mawaba Pascal Dao;Adrian M. Peter
Active inference, grounded in the free energy principle, provides a powerful lens for understanding how agents balance exploration and goal-directed behavior in uncertain environments. Here, we propose a new planning framework that integrates Monte Carlo tree search (MCTS) with active inference objectives to systematically reduce epistemic uncertainty while pursuing extrinsic rewards. Our key insight is that MCTS, already renowned for its search efficiency, can be naturally extended to incorporate free energy minimization by blending expected rewards with information gain. Concretely, the cross-entropy method (CEM) is used to optimize action proposals at the root node, while tree expansions leverage reward modeling alongside intrinsic exploration bonuses. This synergy allows our planner to maintain coherent estimates of value and uncertainty throughout planning, without sacrificing computational tractability. Empirically, we benchmark our planner on a diverse set of continuous control tasks, where it demonstrates performance gains over both stand-alone CEM and MCTS with random rollouts.
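The root-node optimization described here can be sketched as the vanilla cross-entropy method over action sequences; score_fn below is a stand-in for the paper's blend of expected reward and information gain, and all parameters are illustrative:

    import numpy as np

    def cem_plan(score_fn, horizon, act_dim, iters=5, pop=64, n_elite=8, seed=0):
        # Iteratively refit a diagonal Gaussian over action sequences to its elites.
        rng = np.random.default_rng(seed)
        mu = np.zeros((horizon, act_dim))
        std = np.ones((horizon, act_dim))
        for _ in range(iters):
            candidates = mu + std * rng.standard_normal((pop, horizon, act_dim))
            scores = np.array([score_fn(c) for c in candidates])  # reward + info gain
            elites = candidates[np.argsort(scores)[-n_elite:]]
            mu = elites.mean(axis=0)
            std = elites.std(axis=0) + 1e-6    # keep a floor of exploration noise
        return mu                              # proposal used at the MCTS root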
Citations: 0