
Latest Articles in Neural Computation

A Chimera Model for Motion Anticipation in the Retina and the Primary Visual Cortex
IF 2.1 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-10-10 | DOI: 10.1162/neco.a.34
Jérôme Emonet;Selma Souihel;Frédéric Chavane;Alain Destexhe;Matteo di Volo;Bruno Cessac
We propose a mean field model of the primary visual cortex (V1), connected to a realistic retina model, to study the impact of the retina on motion anticipation. We first consider the case where the retina does not itself provide anticipation—anticipation is then triggered only by a cortical mechanism, “anticipation by latency”—and unravel the effects of the retinal input amplitude, of stimulus features such as speed and contrast, and of the size of cortical extensions and fiber conduction speed. Then we explore the changes in the cortical wave of anticipation when V1 is triggered by retina-driven anticipatory mechanisms: gain control and lateral inhibition by amacrine cells. Here, we show how retinal and cortical anticipation combine to provide efficient processing in which the simulated cortical response runs ahead of the moving object that triggers it, compensating for the delays in visual processing.
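A minimal sketch of the gain-control route to anticipation described above, assuming a toy 1-D population whose per-unit gain depresses with recent activity (all constants are illustrative; this is not the authors' chimera model):

```python
import numpy as np

# Toy 1-D anticipation via gain control: the trailing side of a moving
# stimulus depresses local gain, so the population response peaks ahead
# of the stimulus center.

n_units, n_steps, dt = 200, 400, 1.0
x = np.arange(n_units, dtype=float)   # unit positions along a 1-D retina
speed, sigma = 0.3, 8.0               # stimulus speed (units/step) and width
tau_g, beta = 40.0, 0.1               # gain recovery time constant, depression

gain = np.ones(n_units)
lead = []                             # response peak minus stimulus center
for t in range(n_steps):
    center = speed * t
    drive = np.exp(-(x - center) ** 2 / (2 * sigma ** 2))
    response = gain * drive
    # activity depresses the gain; the gain relaxes back toward 1
    gain += dt * ((1.0 - gain) / tau_g - beta * response * gain)
    if 3 * sigma < center < n_units - 3 * sigma:  # skip edge transients
        lead.append(x[np.argmax(response)] - center)

print(f"mean response lead: {np.mean(lead):+.2f} units")  # > 0: anticipation
```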
Citations: 0
Feature Normalization Prevents Collapse of Noncontrastive Learning Dynamics
IF 2.1 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-10-10 | DOI: 10.1162/neco.a.27
Han Bao
Contrastive learning is a self-supervised representation learning framework in which two positive views generated through data augmentation are made similar by an attraction force in a data representation space, while a repulsive force pushes them away from negative examples. Noncontrastive learning, represented by BYOL and SimSiam, gets rid of negative examples and improves computational efficiency. At first sight, the learned representations might collapse into a single point for lack of a repulsive force, but Tian et al. (2021) revealed through learning dynamics analysis that the representations can avoid collapse if data augmentation is sufficiently stronger than regularization. However, their analysis does not take into account commonly used feature normalization, a normalizer applied before measuring the similarity of representations, so excessively strong regularization may still collapse the dynamics, an unnatural behavior in the presence of feature normalization. We therefore extend the previous theory, which was based on the L2 loss, by considering the cosine loss instead, which involves feature normalization. We show that the cosine loss induces sixth-order dynamics (whereas the L2 loss induces third-order dynamics), in which a stable equilibrium dynamically emerges even if only collapsed solutions exist for the given initial parameters. Thus, we offer a new understanding that feature normalization plays an important role in robustly preventing dynamics collapse.
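To make the role of feature normalization concrete, here is a small numerical check of the standard scale-invariance argument (a sketch, not the paper's sixth-order analysis): the gradient of the cosine loss with respect to a representation is orthogonal to that representation, so the loss itself carries no norm-shrinking component, unlike the L2 loss.

```python
import numpy as np

# The cosine-loss gradient wrt p is orthogonal to p (no radial component),
# so only explicit regularization can shrink the feature norm. The L2-loss
# gradient, by contrast, generally has a radial part.

rng = np.random.default_rng(0)
p, z = rng.normal(size=8), rng.normal(size=8)

norm = np.linalg.norm
cospz = np.dot(p, z) / (norm(p) * norm(z))

# analytic gradient of the cosine loss -cos(p, z) with respect to p
grad_cos = -z / (norm(p) * norm(z)) + cospz * p / np.dot(p, p)
grad_l2 = p - z                      # gradient of 0.5 * ||p - z||^2 wrt p

print("cosine grad . p =", np.dot(grad_cos, p))  # ~0: no norm component
print("L2 grad . p     =", np.dot(grad_l2, p))   # generally nonzero
```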
Citations: 0
Firing Rate Models as Associative Memory: Synaptic Design for Robust Retrieval
IF 2.1 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-09-22 | DOI: 10.1162/neco.a.28
Simone Betteti;Giacomo Baggio;Francesco Bullo;Sandro Zampieri
Firing rate models are dynamical systems widely used in applied and theoretical neuroscience to describe local cortical dynamics in neuronal populations. By providing a macroscopic perspective of neuronal activity, these models are essential for investigating oscillatory phenomena, chaotic behavior, and associative memory processes. Despite their widespread use, the application of firing rate models to associative memory networks has received limited mathematical exploration, and most existing studies are focused on specific models. Conversely, well-established associative memory designs, such as Hopfield networks, lack key biologically relevant features intrinsic to firing rate models, including positivity and interpretable synaptic matrices reflecting the action of long-term potentiation and long-term depression. To address this gap, we propose a general framework that ensures the emergence of rescaled memory patterns as stable equilibria in the firing rate dynamics. Furthermore, we analyze the conditions under which the memories are locally and globally asymptotically stable, providing insights into constructing biologically plausible and robust systems for associative memory retrieval.
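For readers who want the basic object in code, the following is a minimal firing-rate associative memory with a single Hebbian pattern; the sigmoidal rate function, weight scale, and noise level are illustrative assumptions, not the authors' synaptic design.

```python
import numpy as np

# Firing-rate dynamics tau * dr/dt = -r + phi(W r) with a sigmoidal rate
# function (rates stay positive) and one Hebbian pattern stored in W.

rng = np.random.default_rng(1)
n, tau, dt, steps = 100, 10.0, 0.5, 400
xi = rng.integers(0, 2, n).astype(float)        # stored binary rate pattern
W = 8.0 * np.outer(2 * xi - 1, 2 * xi - 1) / n  # Hebbian weights (centered)

phi = lambda u: 1.0 / (1.0 + np.exp(-u))        # positive firing-rate function

r = np.clip(xi + 0.4 * rng.normal(size=n), 0, 1)  # noisy retrieval cue
for _ in range(steps):
    r += (dt / tau) * (-r + phi(W @ r))

overlap = np.corrcoef(r, xi)[0, 1]
print(f"retrieval overlap with stored pattern: {overlap:.3f}")
```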
Citations: 0
Rapid Reweighting of Sensory Inputs and Predictions in Visual Perception
IF 2.1 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-09-22 | DOI: 10.1162/neco.a.26
William Turner;Oh-Sang Kwon;Minwoo J.B. Kim;Hinze Hogendoorn
A striking perceptual phenomenon has recently been described wherein people report seeing abrupt jumps in the location of a smoothly moving object (“position resets”). Here, we show that this phenomenon can be understood within the framework of recursive Bayesian estimation as arising from transient gain changes, temporarily prioritizing sensory input over predictive beliefs. From this perspective, position resets reveal a capacity for rapid adaptive precision weighting in human visual perception and offer a possible test bed within which to study the timing and flexibility of sensory gain control.
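A toy version of this account can be written as a recursive estimator with a transiently boosted sensory gain; the gain values, the speed mismatch, and the reset time below are arbitrary assumptions for illustration, not the authors' fitted model.

```python
import numpy as np

# A 1-D recursive (Kalman-style) position tracker whose internal model
# slightly underestimates the object's speed, so the estimate trails the
# target; a one-step boost of the sensory gain snaps the estimate onto the
# measurement -- a caricature of a "position reset".

rng = np.random.default_rng(2)
steps, v, obs_noise = 100, 1.0, 0.5
K_slow, K_boost = 0.05, 0.9            # habitual vs transiently boosted gain

true_pos, est = 0.0, 0.0
for t in range(steps):
    true_pos += v
    est += 0.8 * v                     # prediction (internal model too slow)
    z = true_pos + obs_noise * rng.normal()
    K = K_boost if t == 60 else K_slow # transient reprioritization of input
    est += K * (z - est)               # weight sensory evidence by the gain
    if t in (58, 59, 60, 61):
        print(f"t={t}: true={true_pos:6.2f}  est={est:6.2f}")
```

Before the boost the estimate lags the target by a steady margin; at t=60 it jumps discretely toward the measurement, mirroring the reported percept.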
Citations: 0
Sequential Learning in the Dense Associative Memory
IF 2.1 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-09-22 | DOI: 10.1162/neco.a.20
Hayden McAlister;Anthony Robins;Lech Szymanski
Sequential learning involves learning tasks in a sequence and proves challenging for most neural networks. Biological neural networks regularly succeed at the sequential learning challenge and are even capable of transferring knowledge both forward and backward between tasks. Artificial neural networks often fail outright to transfer performance between tasks and regularly suffer degraded performance or catastrophic forgetting on previous tasks. Models of associative memory, owing to their biological ties and inspirations, have been used to investigate this discrepancy between biological and artificial neural networks; the Hopfield network is the most studied such model. The dense associative memory (DAM), or modern Hopfield network, generalizes the Hopfield network, allowing for greater capacities and prototype learning behaviors while still retaining the associative memory structure. We give a substantial review of the sequential learning space with particular attention to the Hopfield network and associative memories. We present the first published benchmarks of sequential learning in the DAM using various sequential learning techniques and analyze the results to demonstrate previously unseen transitions in the behavior of the DAM. This letter also discusses departures from biological plausibility that may affect the utility of the DAM as a tool for studying biological neural networks. We present our findings, including the effectiveness of a range of state-of-the-art sequential learning methods applied to the DAM, and use these methods to further the understanding of DAM properties and behaviors.
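As context for the benchmarks, the retrieval step of a dense associative memory with exponential interactions reduces to a softmax update over stored patterns. A minimal sketch follows; the pattern count, dimension, and inverse temperature beta are assumed values for illustration, not the paper's experimental setup.

```python
import numpy as np

# Modern-Hopfield / DAM retrieval with the exponential interaction
# function: iterate x <- X^T softmax(beta * X x / sqrt(dim)).

rng = np.random.default_rng(3)
n_patterns, dim, beta = 20, 64, 4.0
X = np.sign(rng.normal(size=(n_patterns, dim)))   # stored +/-1 patterns

def retrieve(query, n_iter=5):
    x = query.copy()
    for _ in range(n_iter):
        sims = beta * (X @ x) / np.sqrt(dim)      # scaled similarities
        p = np.exp(sims - sims.max())
        p /= p.sum()
        x = X.T @ p                               # convex combination of patterns
    return x

cue = X[0] * np.where(rng.random(dim) < 0.2, -1, 1)  # ~20% of bits flipped
out = retrieve(cue.astype(float))
print("overlap with stored pattern:", np.dot(np.sign(out), X[0]) / dim)
```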
Citations: 0
Transformer Models for Signal Processing: Scaled Dot-Product Attention Implements Constrained Filtering
IF 2.1 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-09-22 | DOI: 10.1162/neco.a.29
Terence D. Sanger
The remarkable success of the transformer machine learning architecture for processing language sequences far exceeds the performance of classical signal processing methods. A unique component of transformer models is the scaled dot-product attention (SDPA) layer, which does not appear to have an analog in prior signal processing algorithms. Here, we show that SDPA operates using a novel principle that projects the current state estimate onto the space spanned by prior estimates. We show that SDPA, when used for causal recursive state estimation, implements constrained state estimation in circumstances where the constraint is unknown and may be time varying. Since constraints in high-dimensional space may represent the complex relationships that define nonlinear signals and models, this suggests that the SDPA layer and transformer models leverage constrained estimation to achieve their success. This also suggests that transformers and the SDPA layer could be a computational model for previously unexplained capabilities of human behavior.
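The SDPA layer under discussion is the standard scaled dot-product attention; a causal NumPy sketch makes the filtering reading concrete, since each output row is a convex combination of earlier value vectors.

```python
import numpy as np

# Standard scaled dot-product attention with a causal mask: softmax weights
# are nonnegative and sum to one, so output t lies in the convex hull of
# the value vectors V[0..t].

def sdpa(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)  # no future inputs
    scores[mask] = -np.inf
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

rng = np.random.default_rng(4)
T, d = 6, 4
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))
out = sdpa(Q, K, V)
print(out.shape)     # (6, 4): row t mixes only the values V[:t+1]
```

That convex-combination property is the projection onto the span of prior estimates that the abstract refers to.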
Citations: 0
Distance-Based Logistic Matrix Factorization
IF 2.1 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-09-22 | DOI: 10.1162/neco.a.25
Anoop Praturu;Tatyana O. Sharpee
Matrix factorization is a central paradigm in matrix completion and collaborative filtering. Low-rank factorizations have been extremely successful in reconstructing and generalizing high-dimensional data in a wide variety of machine learning problems from drug-target discovery to music recommendations. Virtually all proposed matrix factorization techniques use the dot product between latent factor vectors to reconstruct the original matrix. We propose a reformulation of the widely used logistic matrix factorization in which we use the distance, rather than the dot product, to measure similarity between latent factors. We show that this measure of similarity, which can draw nonlinear decision boundaries and respect triangle inequalities between points, has more expressive power and modeling capacity. The distance-based model implemented in Euclidean and hyperbolic space outperforms previous formulations of logistic matrix factorization on three different biological test problems with disparate structure and statistics. In particular, we show that a distance-based factorization (1) generalizes better to test data, (2) achieves optimal performance at lower factor space dimension, and (3) clusters data better in the latent factor space.
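The paper's central substitution can be stated in a few lines: score each pair of latent factors by a distance rather than a dot product before the logistic link. The bias offset b below is an assumed illustration, not the paper's exact parameterization.

```python
import numpy as np

# Logistic matrix factorization scores: dot-product form vs the
# distance-based variant, computed for all user-item pairs at once.

rng = np.random.default_rng(5)
n_users, n_items, k = 8, 10, 3
U = rng.normal(size=(n_users, k))         # user latent factors
V = rng.normal(size=(n_items, k))         # item latent factors
b = 2.0                                   # distance offset (assumed)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# standard logistic MF: P(interaction) = sigmoid(u . v)
P_dot = sigmoid(U @ V.T)

# distance-based variant: P(interaction) = sigmoid(b - ||u - v||)
D = np.linalg.norm(U[:, None, :] - V[None, :, :], axis=-1)
P_dist = sigmoid(b - D)

print(P_dot.shape, P_dist.shape)          # both (8, 10)
```

Because the distance respects triangle inequalities and yields nonlinear (spherical) decision boundaries, the same latent dimension can express relationships the dot product cannot.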
Citations: 0
Diversity Deconstrains Component Limitations in Sensorimotor Control
IF 2.1 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-09-22 | DOI: 10.1162/neco.a.24
Yorie Nakahira;Quanying Liu;Xiyu Deng;Terrence J. Sejnowski;John C. Doyle
Human sensorimotor control is remarkably fast and accurate at the system level despite severe speed-accuracy trade-offs at the component level. The discrepancy between the contrasting speed-accuracy trade-offs at these two levels is a paradox. Meanwhile, speed-accuracy trade-offs, heterogeneity, and layered architectures are ubiquitous in nerves, skeletons, and muscles, but they have only been studied in isolation using domain-specific models. In this article, we develop a mechanistic model for how component speed-accuracy trade-offs constrain sensorimotor control that is consistent with Fitts’ law for reaching. The model suggests that diversity among components deconstrains the limitations of individual components in sensorimotor control. Such diversity-enabled sweet spots (DESSs) are ubiquitous in nature, explaining why large heterogeneities exist in the components of biological systems and how natural selection routinely evolves systems with fast and accurate responses using imperfect components.
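For reference, Fitts' law, the benchmark the model is held to, relates movement time to an index of difficulty; a one-function sketch with arbitrary illustrative coefficients follows.

```python
import math

# Fitts' law: movement time MT = a + b * log2(2D / W) for a reach of
# distance D to a target of width W. Coefficients a, b are arbitrary here.

def fitts_movement_time(D, W, a=0.1, b=0.15):
    return a + b * math.log2(2 * D / W)

for D, W in [(10, 2), (20, 2), (20, 1)]:
    print(f"D={D:2d} W={W}: MT = {fitts_movement_time(D, W):.3f} s")
```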
Citations: 0
Fast Multigroup Gaussian Process Factor Models
IF 2.1 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-08-08 | DOI: 10.1162/neco.a.22
Evren Gokcen;Anna I. Jasper;Adam Kohn;Christian K. Machens;Byron M. Yu
Gaussian processes are now commonly used in dimensionality reduction approaches tailored to neuroscience, especially to describe changes in high-dimensional neural activity over time. As recording capabilities expand to include neuronal populations across multiple brain areas, cortical layers, and cell types, interest in extending gaussian process factor models to characterize multipopulation interactions has grown. However, the cubic runtime scaling of current methods with the length of experimental trials and the number of recorded populations (groups) precludes their application to large-scale multipopulation recordings. Here, we improve this scaling from cubic to linear in both trial length and group number. We present two approximate approaches to fitting multigroup gaussian process factor models based on inducing variables and the frequency domain. Empirically, both methods achieved orders of magnitude speed-up with minimal impact on statistical performance, in simulation and on neural recordings of hundreds of neurons across three brain areas. The frequency domain approach, in particular, consistently provided the greatest runtime benefits with the fewest trade-offs in statistical performance. We further characterize the estimation biases introduced by the frequency domain approach and demonstrate effective strategies to mitigate them. This work enables a powerful class of analysis techniques to keep pace with the growing scale of multipopulation recordings, opening new avenues for exploring brain function.
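The key to the frequency-domain speedup is that a stationary kernel on a regular (circular) time grid is diagonalized by the DFT, turning cubic-cost Gaussian computations into elementwise ones. Below is a toy circulant check of that identity, not the paper's full multigroup estimator.

```python
import numpy as np

# A squared-exponential kernel wrapped onto a circle gives a circulant
# covariance K whose eigenvalues are the DFT of its first row, so the
# quadratic form x^T K^{-1} x costs O(T log T) instead of O(T^3).

T, ell = 256, 10.0
t = np.arange(T)
d = np.minimum(t, T - t)                      # circular distance
k_row = np.exp(-0.5 * (d / ell) ** 2) + 1e-6 * (d == 0)   # + jitter
eigvals = np.real(np.fft.fft(k_row))          # eigenvalues of K

x = np.random.default_rng(6).normal(size=T)
xf = np.fft.fft(x)
quad_fft = np.real(np.sum(np.abs(xf) ** 2 / eigvals)) / T   # FFT route

K = np.array([np.roll(k_row, i) for i in range(T)])         # dense check
quad_dense = x @ np.linalg.solve(K, x)
print(f"FFT: {quad_fft:.4f}  dense: {quad_dense:.4f}")      # should match
```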
Citations: 0
Toward Generalized Entropic Sparsification for Convolutional Neural Networks
IF 2.1 | CAS Tier 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-08-08 | DOI: 10.1162/neco.a.21
Tin Barisin;Illia Horenko
Convolutional neural networks (CNNs) are reported to be overparametrized. The search for an optimal (minimal) and sufficient architecture is an NP-hard problem: if the network has N neurons, then there are 2^N ways to connect them—and therefore 2^N possible architectures and 2^N Boolean hyperparameters to encode them. Selecting the best hyperparameter among them is NP-hard, since 2^N grows in N faster than any polynomial N^p. Here, we introduce a layer-by-layer, data-driven pruning method based on a mathematical idea aiming at a computationally scalable entropic relaxation of the pruning problem. The sparse subnetwork is found from the pretrained (full) CNN using network entropy minimization as a sparsity constraint. This allows deploying a numerically scalable algorithm with a sublinear scaling cost. The method is validated on several benchmarks (architectures): on MNIST (LeNet), resulting in sparsity of 55% to 84% with a loss in accuracy of just 0.1% to 0.5%, and on CIFAR-10 (VGG-16, ResNet18), resulting in sparsity of 73% to 89% with a loss in accuracy of 0.1% to 0.5%.
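As a loose illustration of layer-by-layer, data-driven pruning, the sketch below scores each unit by the empirical entropy of its activations over a batch and drops the lowest-entropy (most nearly constant) units; this scoring is only in the spirit of, not identical to, the paper's entropic criterion.

```python
import numpy as np

# Data-driven layer pruning sketch: rank units of one layer by activation
# entropy over a batch and keep only the more informative half.

rng = np.random.default_rng(7)
batch, width = 512, 32
# simulated post-ReLU activations with per-unit scale differences
acts = np.maximum(0, rng.normal(size=(batch, width))
                  * rng.uniform(0.01, 2.0, width))

def unit_entropy(a, bins=16):
    hist, _ = np.histogram(a, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

scores = np.array([unit_entropy(acts[:, j]) for j in range(width)])
keep = scores >= np.quantile(scores, 0.5)      # keep the top half
print(f"pruned {np.sum(~keep)} of {width} units")
```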
Citations: 0