
Neural Computation: Latest Articles

KLIF: An Optimized Spiking Neuron Unit for Tuning Surrogate Gradient Function
IF 2.7 | CAS Tier 4 (Computer Science) | JCR Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-19 | DOI: 10.1162/neco_a_01712
Chunming Jiang;Yilei Zhang
Spiking neural networks (SNNs) have garnered significant attention owing to their adeptness in processing temporal information, low power consumption, and enhanced biological plausibility. Despite these advantages, the development of efficient and high-performing learning algorithms for SNNs remains a formidable challenge. Techniques such as artificial neural network (ANN)-to-SNN conversion can convert ANNs to SNNs with minimal performance loss, but they necessitate prolonged simulations to approximate rate coding accurately. Conversely, the direct training of SNNs using spike-based backpropagation (BP), such as surrogate gradient approximation, is more flexible and widely adopted. Nevertheless, our research revealed that the shape of the surrogate gradient function profoundly influences the training and inference accuracy of SNNs, and in particular the final training accuracy. The shape of the surrogate gradient function is typically selected manually before training and remains static throughout the training process. In this article, we introduce a novel k-based leaky integrate-and-fire (KLIF) spiking neural model. KLIF, featuring a learnable parameter, enables the dynamic adjustment of the height and width of the effective surrogate gradient near the threshold during training. Our proposed model is evaluated on the static CIFAR-10 and CIFAR-100 data sets, as well as the neuromorphic CIFAR10-DVS and DVS128-Gesture data sets. Experimental results demonstrate that KLIF outperforms the leaky integrate-and-fire (LIF) model across multiple data sets and network architectures. This superior performance positions KLIF as a viable replacement for LIF in SNNs across diverse tasks.
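To make the mechanism concrete, here is a minimal PyTorch sketch of a LIF-style neuron whose surrogate-gradient shape is governed by a learnable parameter k. This is an illustration only: it assumes a sigmoid surrogate and a hard reset, all names are invented here, and the published KLIF formulation differs in its details.

```python
import torch

class KSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass; sigmoid surrogate gradient in the
    backward pass, with a learnable sharpness k that reshapes the gradient."""
    @staticmethod
    def forward(ctx, v, k):
        ctx.save_for_backward(v, k)
        return (v >= 0).float()

    @staticmethod
    def backward(ctx, grad_out):
        v, k = ctx.saved_tensors
        s = torch.sigmoid(k * v)
        d_surr = s * (1 - s)                     # derivative of sigmoid(k*v) w.r.t. (k*v)
        return grad_out * k * d_surr, (grad_out * v * d_surr).sum()

class KLIFLikeNeuron(torch.nn.Module):
    def __init__(self, tau=2.0, v_th=1.0):
        super().__init__()
        self.k = torch.nn.Parameter(torch.tensor(3.0))  # surrogate shape, learned
        self.tau, self.v_th = tau, v_th

    def forward(self, x, v):
        v = v + (x - v) / self.tau               # leaky integration step
        spike = KSpike.apply(v - self.v_th, self.k)
        return spike, v * (1.0 - spike)           # hard reset where a spike fired

neuron = KLIFLikeNeuron()
x = torch.randn(8, 16)                            # input current: batch of 8, 16 units
spikes, v = neuron(x, torch.zeros_like(x))
```

Because k receives a gradient like any other parameter, training can widen or sharpen the effective gradient window around the threshold instead of fixing it in advance.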
{"title":"KLIF: An Optimized Spiking Neuron Unit for Tuning Surrogate Gradient Function","authors":"Chunming Jiang;Yilei Zhang","doi":"10.1162/neco_a_01712","DOIUrl":"10.1162/neco_a_01712","url":null,"abstract":"Spiking neural networks (SNNs) have garnered significant attention owing to their adeptness in processing temporal information, low power consumption, and enhanced biological plausibility. Despite these advantages, the development of efficient and high-performing learning algorithms for SNNs remains a formidable challenge. Techniques such as artificial neural network (ANN)-to-SNN conversion can convert ANNs to SNNs with minimal performance loss, but they necessitate prolonged simulations to approximate rate coding accurately. Conversely, the direct training of SNNs using spike-based backpropagation (BP), such as surrogate gradient approximation, is more flexible and widely adopted. Nevertheless, our research revealed that the shape of the surrogate gradient function profoundly influences the training and inference accuracy of SNNs. Importantly, we identified that the shape of the surrogate gradient function significantly affects the final training accuracy. The shape of the surrogate gradient function is typically manually selected before training and remains static throughout the training process. In this article, we introduce a novel k-based leaky integrate-and-fire (KLIF) spiking neural model. KLIF, featuring a learnable parameter, enables the dynamic adjustment of the height and width of the effective surrogate gradient near threshold during training. Our proposed model undergoes evaluation on static CIFAR-10 and CIFAR-100 data sets, as well as neuromorphic CIFAR10-DVS and DVS128-Gesture data sets. Experimental results demonstrate that KLIF outperforms the leaky Integrate-and-Fire (LIF) model across multiple data sets and network architectures. The superior performance of KLIF positions it as a viable replacement for the essential role of LIF in SNNs across diverse tasks.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 12","pages":"2636-2650"},"PeriodicalIF":2.7,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Associative Learning and Active Inference
IF 2.7 | CAS Tier 4 (Computer Science) | JCR Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-19 | DOI: 10.1162/neco_a_01711
Petr Anokhin;Artyom Sorokin;Mikhail Burtsev;Karl Friston
Associative learning is a behavioral phenomenon in which individuals develop connections between stimuli or events based on their co-occurrence. Initially studied by Pavlov in his conditioning experiments, the fundamental principles of learning have been expanded on through the discovery of a wide range of learning phenomena. Computational models have been developed based on the concept of minimizing reward prediction errors. The Rescorla-Wagner model, in particular, is a well-known model that has greatly influenced the field of reinforcement learning. However, the simplicity of these models restricts their ability to fully explain the diverse range of behavioral phenomena associated with learning. In this study, we adopt the free energy principle, which suggests that living systems strive to minimize surprise or uncertainty under their internal models of the world. We consider the learning process as the minimization of free energy and investigate its relationship with the Rescorla-Wagner model, focusing on the informational aspects of learning, different types of surprise, and prediction errors based on beliefs and values. Furthermore, we explore how well-known behavioral phenomena such as blocking, overshadowing, and latent inhibition can be modeled within the active inference framework. We accomplish this by using the informational and novelty aspects of attention, which share similar ideas proposed by seemingly contradictory models such as Mackintosh and Pearce-Hall models. Thus, we demonstrate that the free energy principle, as a theoretical framework derived from first principles, can integrate the ideas and models of associative learning proposed based on empirical experiments and serve as a framework for a better understanding of the computational processes behind associative learning in the brain.
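The Rescorla-Wagner update the authors build on is compact enough to state directly. The sketch below uses illustrative parameter values (not from the paper) and reproduces the blocking phenomenon the abstract mentions:

```python
def rescorla_wagner(trials, alpha=0.1, beta=1.0, n_cues=2):
    """For each present cue i: V[i] += alpha * beta * (reward - sum of V over present cues)."""
    V = [0.0] * n_cues
    for present_cues, reward in trials:
        prediction_error = reward - sum(V[i] for i in present_cues)
        for i in present_cues:
            V[i] += alpha * beta * prediction_error
    return V

# Blocking: cue 0 is conditioned alone, then cues 0 and 1 appear in compound.
trials = [((0,), 1.0)] * 60 + [((0, 1), 1.0)] * 60
print(rescorla_wagner(trials))
# Cue 1 acquires almost no value: cue 0 already predicts the reward,
# so the prediction error that would drive learning is near zero.
```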
{"title":"Associative Learning and Active Inference","authors":"Petr Anokhin;Artyom Sorokin;Mikhail Burtsev;Karl Friston","doi":"10.1162/neco_a_01711","DOIUrl":"10.1162/neco_a_01711","url":null,"abstract":"Associative learning is a behavioral phenomenon in which individuals develop connections between stimuli or events based on their co-occurrence. Initially studied by Pavlov in his conditioning experiments, the fundamental principles of learning have been expanded on through the discovery of a wide range of learning phenomena. Computational models have been developed based on the concept of minimizing reward prediction errors. The Rescorla-Wagner model, in particular, is a well-known model that has greatly influenced the field of reinforcement learning. However, the simplicity of these models restricts their ability to fully explain the diverse range of behavioral phenomena associated with learning. In this study, we adopt the free energy principle, which suggests that living systems strive to minimize surprise or uncertainty under their internal models of the world. We consider the learning process as the minimization of free energy and investigate its relationship with the Rescorla-Wagner model, focusing on the informational aspects of learning, different types of surprise, and prediction errors based on beliefs and values. Furthermore, we explore how well-known behavioral phenomena such as blocking, overshadowing, and latent inhibition can be modeled within the active inference framework. We accomplish this by using the informational and novelty aspects of attention, which share similar ideas proposed by seemingly contradictory models such as Mackintosh and Pearce-Hall models. Thus, we demonstrate that the free energy principle, as a theoretical framework derived from first principles, can integrate the ideas and models of associative learning proposed based on empirical experiments and serve as a framework for a better understanding of the computational processes behind associative learning in the brain.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 12","pages":"2602-2635"},"PeriodicalIF":2.7,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures
IF 2.7 | CAS Tier 4 (Computer Science) | JCR Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-19 | DOI: 10.1162/neco_a_01718
Devdhar Patel;Terrence Sejnowski;Hava Siegelmann
The current reinforcement learning framework focuses exclusively on performance, often at the expense of efficiency. In contrast, biological control achieves remarkable performance while also optimizing computational energy expenditure and decision frequency. We propose a decision-bounded Markov decision process (DB-MDP) that constrains the number of decisions and the computational energy available to agents in reinforcement learning environments. Our experiments demonstrate that existing reinforcement learning algorithms struggle within this framework, leading to either failure or suboptimal performance. To address this, we introduce a biologically inspired, temporally layered architecture (TLA) that enables agents to manage computational costs through two layers with distinct timescales and energy requirements. TLA achieves optimal performance in decision-bounded environments and, in continuous control environments, matches state-of-the-art performance at a fraction of the computational cost. Compared to current reinforcement learning algorithms that prioritize performance alone, our approach significantly lowers computational energy expenditure while maintaining performance. These findings establish a benchmark and pave the way for future research on energy- and time-aware control.
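As a toy illustration of the two-timescale idea (not the authors' TLA implementation), the sketch below lets a cheap fast gate decide on every step whether to reuse the cached action or invoke the expensive slow policy, so the decision count stays well below the horizon; every function and threshold here is invented for the example:

```python
import random

def slow_policy(obs):                      # expensive layer: full deliberation
    return 1.0 if obs > 0 else -1.0

def fast_gate(obs, obs_at_last_decision):  # cheap layer: re-decide only on change
    return obs_at_last_decision is None or abs(obs - obs_at_last_decision) > 0.3

def run_episode(horizon=200, seed=0):
    rng = random.Random(seed)
    obs, cached, last_obs, n_decisions = rng.uniform(-1, 1), 0.0, None, 0
    for _ in range(horizon):
        if fast_gate(obs, last_obs):       # pay the decision cost only here
            cached, last_obs, n_decisions = slow_policy(obs), obs, n_decisions + 1
        obs += -0.1 * cached + rng.gauss(0.0, 0.02)   # drive the state toward 0
    return n_decisions

print(run_episode())   # far fewer decisions than steps, at similar control quality
```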
{"title":"Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures","authors":"Devdhar Patel;Terrence Sejnowski;Hava Siegelmann","doi":"10.1162/neco_a_01718","DOIUrl":"10.1162/neco_a_01718","url":null,"abstract":"The current reinforcement learning framework focuses exclusively on performance, often at the expense of efficiency. In contrast, biological control achieves remarkable performance while also optimizing computational energy expenditure and decision frequency. We propose a decision-bounded Markov decision process (DB-MDP) that constrains the number of decisions and computational energy available to agents in reinforcement learning environments. Our experiments demonstrate that existing reinforcement learning algorithms struggle within this framework, leading to either failure or suboptimal performance. To address this, we introduce a biologically inspired, temporally layered architecture (TLA), enabling agents to manage computational costs through two layers with distinct timescales and energy requirements. TLA achieves optimal performance in decision-bounded environments and in continuous control environments, matching state-of-the-art performance while using a fraction of the computing cost. Compared to current reinforcement learning algorithms that solely prioritize performance, our approach significantly lowers computational energy expenditure while maintaining performance. These findings establish a benchmark and pave the way for future research on energy and time-aware control.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 12","pages":"2734-2763"},"PeriodicalIF":2.7,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142395375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Deep Nonnegative Matrix Factorization With Beta Divergences
IF 2.7 | CAS Tier 4 (Computer Science) | JCR Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-11 | DOI: 10.1162/neco_a_01679
Valentin Leplat;Le T. K. Hien;Akwum Onwunta;Nicolas Gillis
Deep nonnegative matrix factorization (deep NMF) has recently emerged as a valuable technique for extracting multiple layers of features across different scales. However, all existing deep NMF models and algorithms have primarily centered their evaluation on the least squares error, which may not be the most appropriate metric for assessing the quality of approximations on diverse data sets. For instance, when dealing with data types such as audio signals and documents, it is widely acknowledged that β-divergences offer a more suitable alternative. In this article, we develop new models and algorithms for deep NMF using some β-divergences, with a focus on the Kullback-Leibler divergence. Subsequently, we apply these techniques to the extraction of facial features, the identification of topics within document collections, and the identification of materials within hyperspectral images.
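For reference, the classical Lee-Seung multiplicative updates for KL-divergence NMF look as follows; a deep variant in the spirit of the article then factorizes the inner factor again. This is a generic sketch of the standard technique, not the authors' algorithm:

```python
import numpy as np

def kl_nmf(V, rank, iters=300, eps=1e-9, seed=0):
    """Multiplicative updates minimizing the KL divergence D(V || WH)."""
    m, n = V.shape
    rng = np.random.default_rng(seed)
    W, H = rng.random((m, rank)) + eps, rng.random((rank, n)) + eps
    ones = np.ones_like(V)
    for _ in range(iters):
        H *= (W.T @ (V / (W @ H + eps))) / (W.T @ ones + eps)
        W *= ((V / (W @ H + eps)) @ H.T) / (ones @ H.T + eps)
    return W, H

# Two-layer "deep" factorization: V ~ W1 @ H1, then H1 ~ W2 @ H2,
# so W1 @ W2 carries coarse features and H2 is the deepest encoding.
V = np.abs(np.random.default_rng(1).normal(size=(50, 80)))
W1, H1 = kl_nmf(V, rank=10)
W2, H2 = kl_nmf(H1, rank=4)
```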
{"title":"Deep Nonnegative Matrix Factorization With Beta Divergences","authors":"Valentin Leplat;Le T. K. Hien;Akwum Onwunta;Nicolas Gillis","doi":"10.1162/neco_a_01679","DOIUrl":"10.1162/neco_a_01679","url":null,"abstract":"Deep nonnegative matrix factorization (deep NMF) has recently emerged as a valuable technique for extracting multiple layers of features across different scales. However, all existing deep NMF models and algorithms have primarily centered their evaluation on the least squares error, which may not be the most appropriate metric for assessing the quality of approximations on diverse data sets. For instance, when dealing with data types such as audio signals and documents, it is widely acknowledged that ß-divergences offer a more suitable alternative. In this article, we develop new models and algorithms for deep NMF using some ß-divergences, with a focus on the Kullback-Leibler divergence. Subsequently, we apply these techniques to the extraction of facial features, the identification of topics within document collections, and the identification of materials within hyperspectral images.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2365-2402"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multimodal and Multifactor Branching Time Active Inference
IF 2.7 | CAS Tier 4 (Computer Science) | JCR Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-11 | DOI: 10.1162/neco_a_01703
Théophile Champion;Marek Grześ;Howard Bowman
Active inference is a state-of-the-art framework for modeling the brain that explains a wide range of mechanisms. Recently, two versions of branching time active inference (BTAI) have been developed to handle the exponential (space and time) complexity class that occurs when computing the prior over all possible policies up to the time horizon. However, those two versions of BTAI still suffer from an exponential complexity class with regard to the number of observed and latent variables being modeled. We resolve this limitation by allowing each observation to have its own likelihood mapping and each latent variable to have its own transition mapping. The implicit mean field approximation was tested in terms of its efficiency and computational cost using a dSprites environment in which the metadata of the dSprites data set was used as input to the model. In this setting, earlier implementations of branching time active inference (namely, BTAI_VMP and BTAI_BF) underperformed relative to the mean field approximation (BTAI_3MF) in terms of performance and computational efficiency. Specifically, BTAI_VMP was able to solve 96.9% of the task in 5.1 seconds, and BTAI_BF was able to solve 98.6% of the task in 17.5 seconds. Our new approach outperformed both of its predecessors by solving the task completely (100%) in only 2.559 seconds.
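The complexity claim is easy to see with a back-of-the-envelope count (illustrative numbers only, not the BTAI implementation): a joint likelihood over M observation modalities grows exponentially in M, while one mapping per modality grows linearly.

```python
# Size of the likelihood mapping for S latent states and M binary modalities:
def joint_mapping_size(M, S=10):
    return S * 2 ** M            # one entry block per joint observation outcome

def per_modality_size(M, S=10):
    return M * S * 2             # one small mapping per modality, as in the factorized model

for M in (2, 5, 10, 20):
    print(M, joint_mapping_size(M), per_modality_size(M))
# At 20 modalities: ~10 million entries jointly vs. 400 when factorized.
```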
{"title":"Multimodal and Multifactor Branching Time Active Inference","authors":"Théophile Champion;Marek Grześ;Howard Bowman","doi":"10.1162/neco_a_01703","DOIUrl":"10.1162/neco_a_01703","url":null,"abstract":"Active inference is a state-of-the-art framework for modeling the brain that explains a wide range of mechanisms. Recently, two versions of branching time active inference (BTAI) have been developed to handle the exponential (space and time) complexity class that occurs when computing the prior over all possible policies up to the time horizon. However, those two versions of BTAI still suffer from an exponential complexity class with regard to the number of observed and latent variables being modeled. We resolve this limitation by allowing each observation to have its own likelihood mapping and each latent variable to have its own transition mapping. The implicit mean field approximation was tested in terms of its efficiency and computational cost using a dSprites environment in which the metadata of the dSprites data set was used as input to the model. In this setting, earlier implementations of branching time active inference (namely, BTAIVMP and BTAIBF) underperformed in relation to the mean field approximation (BTAI3MF) in terms of performance and computational efficiency. Specifically, BTAIVMP was able to solve 96.9% of the task in 5.1 seconds, and BTAIBF was able to solve 98.6% of the task in 17.5 seconds. Our new approach outperformed both of its predecessors by solving the task completely (100%) in only 2.559 seconds.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2479-2504"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142114870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Prototype Analysis in Hopfield Networks With Hebbian Learning
IF 2.7 | CAS Tier 4 (Computer Science) | JCR Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-11 | DOI: 10.1162/neco_a_01704
Hayden McAlister;Anthony Robins;Lech Szymanski
We discuss prototype formation in the Hopfield network. Typically, Hebbian learning with highly correlated states leads to degraded memory performance. We show that this type of learning can lead to prototype formation, where unlearned states emerge as representatives of large correlated subsets of states, alleviating capacity woes. This process has similarities to prototype learning in human cognition. We provide a substantial literature review of prototype learning in associative memories, covering contributions from psychology, statistical physics, and computer science. We analyze prototype formation from a theoretical perspective and derive a stability condition for these states based on the number of examples of the prototype presented for learning, the noise in those examples, and the number of nonexample states presented. The stability condition is used to construct a probability of stability for a prototype state as the factors of stability change. We also note similarities to traditional network analysis, allowing us to find a prototype capacity. We corroborate these expectations of prototype formation with experiments using a simple Hopfield network with standard Hebbian learning. We extend our experiments to a Hopfield network trained on data with multiple prototypes and find the network is capable of stabilizing multiple prototypes concurrently. We measure the basins of attraction of the multiple prototype states, finding attractor strength grows with the number of examples and the agreement of examples. We link the stability and dominance of prototype states to the energy profile of these states, particularly when comparing the profile shape to target states or other spurious states.
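The core effect is easy to reproduce with a vanilla Hopfield network. The sketch below (arbitrary sizes and noise level, not the authors' experimental setup) stores noisy examples of a hidden prototype with the standard Hebbian rule and shows recall settling near the never-presented prototype:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_examples, flip_p = 200, 25, 0.15

prototype = rng.choice([-1, 1], size=N)          # never stored directly
noise = rng.choice([1, -1], size=(n_examples, N), p=[1 - flip_p, flip_p])
examples = prototype * noise                     # each bit flipped w.p. flip_p

W = examples.T @ examples / N                    # standard Hebbian outer products
np.fill_diagonal(W, 0)

state = examples[0].copy()                       # probe with one stored example
for _ in range(30):                              # synchronous recall dynamics
    state = np.where(W @ state >= 0, 1, -1)

print((state == prototype).mean())               # ~1.0: the prototype is the attractor
```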
{"title":"Prototype Analysis in Hopfield Networks With Hebbian Learning","authors":"Hayden McAlister;Anthony Robins;Lech Szymanski","doi":"10.1162/neco_a_01704","DOIUrl":"10.1162/neco_a_01704","url":null,"abstract":"We discuss prototype formation in the Hopfield network. Typically, Hebbian learning with highly correlated states leads to degraded memory performance. We show that this type of learning can lead to prototype formation, where unlearned states emerge as representatives of large correlated subsets of states, alleviating capacity woes. This process has similarities to prototype learning in human cognition. We provide a substantial literature review of prototype learning in associative memories, covering contributions from psychology, statistical physics, and computer science. We analyze prototype formation from a theoretical perspective and derive a stability condition for these states based on the number of examples of the prototype presented for learning, the noise in those examples, and the number of nonexample states presented. The stability condition is used to construct a probability of stability for a prototype state as the factors of stability change. We also note similarities to traditional network analysis, allowing us to find a prototype capacity. We corroborate these expectations of prototype formation with experiments using a simple Hopfield network with standard Hebbian learning. We extend our experiments to a Hopfield network trained on data with multiple prototypes and find the network is capable of stabilizing multiple prototypes concurrently. We measure the basins of attraction of the multiple prototype states, finding attractor strength grows with the number of examples and the agreement of examples. We link the stability and dominance of prototype states to the energy profile of these states, particularly when comparing the profile shape to target states or other spurious states.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2322-2364"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142114872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Latent Space Bayesian Optimization With Latent Data Augmentation for Enhanced Exploration
IF 2.7 | CAS Tier 4 (Computer Science) | JCR Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-11 | DOI: 10.1162/neco_a_01708
Onur Boyar;Ichiro Takeuchi
Latent space Bayesian optimization (LSBO) combines generative models, typically variational autoencoders (VAE), with Bayesian optimization (BO) to generate de novo objects of interest. However, LSBO faces challenges due to the mismatch between the objectives of BO and VAE, resulting in poor exploration capabilities. In this article, we propose novel contributions to enhance LSBO efficiency and overcome this challenge. We first introduce the concept of latent consistency/inconsistency as a crucial problem in LSBO, arising from the VAE-BO mismatch. To address this, we propose the latent consistency-aware acquisition function (LCA-AF), which leverages consistent points in LSBO. Additionally, we present LCA-VAE, a novel VAE method that creates a latent space with more consistent points through data augmentation in latent space and penalization of latent inconsistencies. Combining LCA-VAE and LCA-AF, we develop LCA-LSBO. Our approach achieves high sample efficiency and effective exploration, emphasizing the significance of addressing latent consistency through data augmentation in latent space within LCA-VAE. We showcase the performance of our proposal via de novo image generation and de novo chemical design tasks.
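As a rough illustration of the latent-consistency notion (with toy functions standing in for a trained VAE; nothing here is the authors' code), a latent point is "consistent" when re-encoding its decoding returns approximately the same point:

```python
import numpy as np

def consistency_gap(z, decode, encode):
    """Gap between z and encode(decode(z)); a small gap marks a consistent latent point."""
    return float(np.linalg.norm(encode(decode(z)) - z))

# Toy, deliberately imperfect stand-ins for a trained VAE's decoder/encoder:
A = np.array([[0.9, 0.2], [0.0, 1.1]])
decode = lambda z: A @ z
encode = lambda x: np.linalg.solve(A, x) + 0.05 * np.tanh(x)

for z in (np.array([0.1, -0.1]), np.array([3.0, -2.5])):
    print(z, consistency_gap(z, decode, encode))
# An LCA-style acquisition would downweight candidates with large gaps.
```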
{"title":"Latent Space Bayesian Optimization With Latent Data Augmentation for Enhanced Exploration","authors":"Onur Boyar;Ichiro Takeuchi","doi":"10.1162/neco_a_01708","DOIUrl":"10.1162/neco_a_01708","url":null,"abstract":"Latent space Bayesian optimization (LSBO) combines generative models, typically variational autoencoders (VAE), with Bayesian optimization (BO), to generate de novo objects of interest. However, LSBO faces challenges due to the mismatch between the objectives of BO and VAE, resulting in poor exploration capabilities. In this article, we propose novel contributions to enhance LSBO efficiency and overcome this challenge. We first introduce the concept of latent consistency/inconsistency as a crucial problem in LSBO, arising from the VAE-BO mismatch. To address this, we propose the latent consistent aware-acquisition function (LCA-AF) that leverages consistent points in LSBO. Additionally, we present LCA-VAE, a novel VAE method that creates a latent space with increased consistent points through data augmentation in latent space and penalization of latent inconsistencies. Combining LCA-VAE and LCA-AF, we develop LCA-LSBO. Our approach achieves high sample efficiency and effective exploration, emphasizing the significance of addressing latent consistency through the novel incorporation of data augmentation in latent space within LCA-VAE in LSBO. We showcase the performance of our proposal via de novo image generation and de novo chemical design tasks.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2446-2478"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Learning Internal Representations of 3D Transformations From 2D Projected Inputs
IF 2.7 | CAS Tier 4 (Computer Science) | JCR Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-11 | DOI: 10.1162/neco_a_01695
Marissa Connor;Bruno Olshausen;Christopher Rozell
We describe a computational model for inferring 3D structure from the motion of projected 2D points in an image, with the aim of understanding how biological vision systems learn and internally represent 3D transformations from the statistics of their input. The model uses manifold transport operators to describe the action of 3D points in a scene as they undergo transformation. We show that the model can learn the generator of the Lie group for these transformations from purely 2D input, providing a proof-of-concept demonstration for how biological systems could adapt their internal representations based on sensory input. Focusing on a rotational model, we evaluate the ability of the model to infer depth from moving 2D projected points and to learn rotational transformations from 2D training stimuli. Finally, we compare the model performance to psychophysical performance on structure-from-motion tasks.
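A minimal numerical check of the underlying geometry (not the authors' transport-operator model): under a small rotation about the vertical axis, the horizontal image displacement of an orthographically projected point is approximately proportional to its depth, which is what makes depth recoverable from 2D motion.

```python
import numpy as np

def rot_y(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

pts = np.random.default_rng(0).normal(size=(100, 3))   # 3D scene points
theta = 0.05
frame0 = pts[:, :2]                                    # orthographic projection
frame1 = (rot_y(theta) @ pts.T).T[:, :2]               # view after a small rotation

dx = frame1[:, 0] - frame0[:, 0]        # horizontal image motion
depth_est = dx / np.sin(theta)          # dx = sin(theta)*Z + (cos(theta)-1)*X
print(np.corrcoef(depth_est, pts[:, 2])[0, 1])   # close to 1 for small theta
```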
{"title":"Learning Internal Representations of 3D Transformations From 2D Projected Inputs","authors":"Marissa Connor;Bruno Olshausen;Christopher Rozell","doi":"10.1162/neco_a_01695","DOIUrl":"10.1162/neco_a_01695","url":null,"abstract":"We describe a computational model for inferring 3D structure from the motion of projected 2D points in an image, with the aim of understanding how biological vision systems learn and internally represent 3D transformations from the statistics of their input. The model uses manifold transport operators to describe the action of 3D points in a scene as they undergo transformation. We show that the model can learn the generator of the Lie group for these transformations from purely 2D input, providing a proof-of-concept demonstration for how biological systems could adapt their internal representations based on sensory input. Focusing on a rotational model, we evaluate the ability of the model to infer depth from moving 2D projected points and to learn rotational transformations from 2D training stimuli. Finally, we compare the model performance to psychophysical performance on structure-from-motion tasks.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2505-2539"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141984035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Spiking Neural Network Pressure Sensor
IF 2.7 | CAS Tier 4 (Computer Science) | JCR Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-11 | DOI: 10.1162/neco_a_01706
Michał Markiewicz;Ireneusz Brzozowski;Szymon Janusz
Von Neumann architecture requires information to be encoded as numerical values. For that reason, artificial neural networks running on computers require the data coming from sensors to be discretized. Other network architectures that more closely mimic biological neural networks (e.g., spiking neural networks) can be simulated on von Neumann architecture, but more important, they can also be executed on dedicated electrical circuits with orders of magnitude lower power consumption. Unfortunately, input signal conditioning and encoding are usually not supported by such circuits, so a separate module consisting of an analog-to-digital converter, encoder, and transmitter is required. The aim of this article is to propose a sensor architecture whose output signal can be directly connected to the input of a spiking neural network. We demonstrate that the output signal is a valid spike source for Izhikevich model neurons, ensuring the proper operation of a number of neurocomputational features. The advantages are clear: much lower power consumption, smaller area, and a less complex electronic circuit. The main disadvantage is that the sensor characteristics somewhat limit the parameters of applicable spiking neurons. The proposed architecture is illustrated by a case study involving a capacitive pressure sensor circuit, which is compatible with most of the neurocomputational properties of the Izhikevich neuron model. The sensor itself is characterized by very low power consumption: it draws only 3.49 μA at 3.3 V.
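For readers unfamiliar with the neuron model the sensor targets, a standard Izhikevich simulation (regular-spiking parameters; the drive current here merely stands in for the sensor's pressure-dependent output) shows how input strength maps to spike rate:

```python
import numpy as np

def izhikevich_spikes(I, a=0.02, b=0.2, c=-65.0, d=8.0, dt=1.0):
    """Simulate dv/dt = 0.04v^2 + 5v + 140 - u + I and du/dt = a(bv - u);
    when v >= 30 mV: record a spike, reset v <- c and bump u <- u + d."""
    v, u, spike_times = c, b * c, []
    for t, i_t in enumerate(I):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + i_t)
        u += dt * a * (b * v - u)
        if v >= 30.0:
            spike_times.append(t)
            v, u = c, u + d
    return spike_times

for drive in (4.0, 8.0, 12.0):   # stand-in for pressure-modulated input current
    print(drive, len(izhikevich_spikes(np.full(1000, drive))))
# Stronger drive -> higher spike count: the firing rate encodes the sensed quantity.
```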
{"title":"Spiking Neural Network Pressure Sensor","authors":"Michał Markiewicz;Ireneusz Brzozowski;Szymon Janusz","doi":"10.1162/neco_a_01706","DOIUrl":"10.1162/neco_a_01706","url":null,"abstract":"Von Neumann architecture requires information to be encoded as numerical values. For that reason, artificial neural networks running on computers require the data coming from sensors to be discretized. Other network architectures that more closely mimic biological neural networks (e.g., spiking neural networks) can be simulated on von Neumann architecture, but more important, they can also be executed on dedicated electrical circuits having orders of magnitude less power consumption. Unfortunately, input signal conditioning and encoding are usually not supported by such circuits, so a separate module consisting of an analog-to-digital converter, encoder, and transmitter is required. The aim of this article is to propose a sensor architecture, the output signal of which can be directly connected to the input of a spiking neural network. We demonstrate that the output signal is a valid spike source for the Izhikevich model neurons, ensuring the proper operation of a number of neurocomputational features. The advantages are clear: much lower power consumption, smaller area, and a less complex electronic circuit. The main disadvantage is that sensor characteristics somehow limit the parameters of applicable spiking neurons. The proposed architecture is illustrated by a case study involving a capacitive pressure sensor circuit, which is compatible with most of the neurocomputational properties of the Izhikevich neuron model. The sensor itself is characterized by very low power consumption: it draws only 3.49 μA at 3.3 V.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2299-2321"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142037774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
ℓ1-Regularized ICA: A Novel Method for Analysis of Task-Related fMRI Data
IF 2.7 | CAS Tier 4 (Computer Science) | JCR Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-10-11 | DOI: 10.1162/neco_a_01709
Yusuke Endo;Koujin Takeda
We propose a new method of independent component analysis (ICA) to extract appropriate features from high-dimensional data. In general, matrix factorization methods, including ICA, suffer from limited interpretability of the extracted features; a sparsity constraint on the factorized matrix helps improve it. With this background, we construct a new ICA method with sparsity. In our method, an ℓ1-regularization term is added to the cost function of ICA, and the cost function is minimized by a difference-of-convex-functions algorithm. To validate the proposed method, we apply it to synthetic data and real functional magnetic resonance imaging data.
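The objective is easy to write down. The sketch below uses a generic maximum-likelihood ICA cost with a log-cosh source prior plus the ℓ1 term, optimized by plain (sub)gradient ascent as a stand-in for the difference-of-convex algorithm the authors actually use:

```python
import numpy as np

def l1_ica(X, lam=0.05, lr=0.01, iters=1000):
    """Maximize log|det W| - E[sum log cosh(WX)] - lam * ||W||_1 over the unmixing W."""
    n, T = X.shape
    W = np.eye(n)
    for _ in range(iters):
        S = W @ X
        grad = np.linalg.inv(W).T - np.tanh(S) @ X.T / T   # log-likelihood gradient
        W += lr * (grad - lam * np.sign(W))                # l1 subgradient shrinkage
    return W

# Two super-gaussian sources, linearly mixed:
rng = np.random.default_rng(0)
S_true = rng.laplace(size=(2, 2000))
A = np.array([[1.0, 0.6], [0.4, 1.0]])
W = l1_ica(A @ S_true)
print(W @ A)   # approximately a scaled permutation if unmixing succeeded
```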
{"title":"ℓ1-Regularized ICA: A Novel Method for Analysis of Task-Related fMRI Data","authors":"Yusuke Endo;Koujin Takeda","doi":"10.1162/neco_a_01709","DOIUrl":"10.1162/neco_a_01709","url":null,"abstract":"We propose a new method of independent component analysis (ICA) in order to extract appropriate features from high-dimensional data. In general, matrix factorization methods including ICA have a problem regarding the interpretability of extracted features. For the improvement of interpretability, sparse constraint on a factorized matrix is helpful. With this background, we construct a new ICA method with sparsity. In our method, the ℓ1-regularization term is added to the cost function of ICA, and minimization of the cost function is performed by a difference of convex functions algorithm. For the validity of our proposed method, we apply it to synthetic data and real functional magnetic resonance imaging data.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2540-2570"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0