Valentin Leplat, Le T K Hien, Akwum Onwunta, Nicolas Gillis
Deep nonnegative matrix factorization (deep NMF) has recently emerged as a valuable technique for extracting multiple layers of features across different scales. However, all existing deep NMF models and algorithms have primarily centered their evaluation on the least squares error, which may not be the most appropriate metric for assessing the quality of approximations on diverse data sets. For instance, when dealing with data types such as audio signals and documents, it is widely acknowledged that β-divergences offer a more suitable alternative. In this article, we develop new models and algorithms for deep NMF using some β-divergences, with a focus on the Kullback-Leibler divergence. Subsequently, we apply these techniques to the extraction of facial features, the identification of topics within document collections, and the identification of materials within hyperspectral images.
Deep Nonnegative Matrix Factorization with Beta Divergences. Valentin Leplat, Le T K Hien, Akwum Onwunta, Nicolas Gillis. Neural Computation, 2024-09-23. doi:10.1162/neco_a_01679
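As a rough illustration of the error measures contrasted in the abstract above, the sketch below evaluates the standard scalar β-divergence for strictly positive values. It is not the paper's deep NMF algorithm; the function names `beta_divergence` and `matrix_divergence` are hypothetical and chosen only for this example. With β = 2 it reduces to half the squared error, with β = 1 to the (generalized) Kullback-Leibler divergence, and with β = 0 to the Itakura-Saito divergence.

```python
import math


def beta_divergence(x, y, beta):
    """Scalar beta-divergence d_beta(x | y), assuming x > 0 and y > 0."""
    if x <= 0 or y <= 0:
        raise ValueError("this sketch assumes strictly positive inputs")
    if beta == 1:
        # Generalized Kullback-Leibler divergence.
        return x * math.log(x / y) - x + y
    if beta == 0:
        # Itakura-Saito divergence.
        return x / y - math.log(x / y) - 1
    # General case; beta = 2 gives 0.5 * (x - y) ** 2.
    return (x**beta + (beta - 1) * y**beta - beta * x * y ** (beta - 1)) / (
        beta * (beta - 1)
    )


def matrix_divergence(X, Y, beta):
    """Sum of entrywise beta-divergences between two equally shaped matrices."""
    return sum(
        beta_divergence(x, y, beta)
        for row_x, row_y in zip(X, Y)
        for x, y in zip(row_x, row_y)
    )
```

In an NMF setting, `X` would be the data matrix and `Y` its low-rank reconstruction; the choice of β then decides how strongly large and small entries are weighted in the fit.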
Vasily Zadorozhnyy, Edison Mucllari, Cole Pospisil, Duc Nguyen, Qiang Ye
In recent years, orthogonal matrices have been shown to be a promising way to improve the training, stability, and convergence of recurrent neural networks (RNNs), particularly by controlling gradients. While gated recurrent unit (GRU) and long short-term memory (LSTM) architectures address the vanishing gradient problem by using a variety of gates and memory cells, they are still prone to the exploding gradient problem. In this work, we analyze the gradients in GRU and propose the use of orthogonal matrices to prevent exploding gradient problems and enhance long-term memory. We study where to use orthogonal matrices and propose a Neumann series-based scaled Cayley transformation for training orthogonal matrices in GRU, which we call Neumann-Cayley orthogonal GRU (NC-GRU). We present detailed experiments of our model on several synthetic and real-world tasks, which show that NC-GRU significantly outperforms GRU and several other RNNs.
Orthogonal Gated Recurrent Unit With Neumann-Cayley Transformation. Vasily Zadorozhnyy, Edison Mucllari, Cole Pospisil, Duc Nguyen, Qiang Ye. Neural Computation, 2024-09-23. doi:10.1162/neco_a_01710
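The abstract above does not spell out the construction, so the following is only a sketch under stated assumptions. It uses one common form of the scaled Cayley transform, W = (I + A)^{-1}(I - A)D with A skew-symmetric and D diagonal with ±1 entries, and it assumes that the Neumann series is used to approximate the inverse (I + A)^{-1} by a truncated sum of powers of -A, which converges only when the spectral radius of A is below 1. All function names are hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(a, b, sign=1.0):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def neumann_inverse(a, order):
    """Approximate (I + A)^{-1} by sum_{k=0}^{order} (-A)^k (assumes small A)."""
    neg_a = [[-x for x in row] for row in a]
    term = identity(len(a))
    total = identity(len(a))
    for _ in range(order):
        term = matmul(term, neg_a)  # next power of -A
        total = add(total, term)
    return total


def scaled_cayley(a, d_diag, order):
    """W = (I + A)^{-1} (I - A) D, with the inverse replaced by a Neumann sum."""
    n = len(a)
    d = [[d_diag[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    i_minus_a = add(identity(n), a, sign=-1.0)
    return matmul(matmul(neumann_inverse(a, order), i_minus_a), d)


# Skew-symmetric A with small entries, so the series converges quickly.
A = [[0.0, 0.3], [-0.3, 0.0]]
W = scaled_cayley(A, [1.0, -1.0], order=60)
gram = matmul(transpose(W), W)  # close to the identity when W is orthogonal
```

A low truncation order trades exact orthogonality for cheaper computation; how the paper balances this is not stated in the abstract.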
Latent space Bayesian optimization (LSBO) combines generative models, typically variational autoencoders (VAE), with Bayesian optimization (BO), to generate de novo objects of interest. However, LSBO faces challenges due to the mismatch between the objectives of BO and VAE, resulting in poor exploration capabilities. In this article, we propose novel contributions to enhance LSBO efficiency and overcome this challenge. We first introduce the concept of latent consistency/inconsistency as a crucial problem in LSBO, arising from the VAE-BO mismatch. To address this, we propose the latent consistent aware-acquisition function (LCA-AF) that leverages consistent points in LSBO. Additionally, we present LCA-VAE, a novel VAE method that creates a latent space with increased consistent points through data augmentation in latent space and penalization of latent inconsistencies. Combining LCA-VAE and LCA-AF, we develop LCA-LSBO. Our approach achieves high sample efficiency and effective exploration, emphasizing the significance of addressing latent consistency through the novel incorporation of data augmentation in latent space within LCA-VAE in LSBO. We showcase the performance of our proposal via de novo image generation and de novo chemical design tasks.
Latent Space Bayesian Optimization With Latent Data Augmentation for Enhanced Exploration. Onur Boyar, Ichiro Takeuchi. Neural Computation, 2024-09-23. doi:10.1162/neco_a_01708
We propose a new method of independent component analysis (ICA) to extract appropriate features from high-dimensional data. In general, matrix factorization methods, including ICA, suffer from limited interpretability of the extracted features. A sparsity constraint on a factorized matrix helps to improve interpretability. With this background, we construct a new ICA method with sparsity. In our method, the ℓ1-regularized IC term is added to the cost function of ICA, and minimization of the cost function is performed by a difference of convex functions algorithm. To validate our proposed method, we apply it to synthetic data and real functional magnetic resonance imaging data.
ℓ1-Regularized ICA: A Novel Method for Analysis of Task-Related fMRI Data. Yusuke Endo, Koujin Takeda. Neural Computation, 2024-09-23. doi:10.1162/neco_a_01709
Spiking neural networks (SNNs) have garnered significant attention owing to their adeptness in processing temporal information, low power consumption, and enhanced biological plausibility. Despite these advantages, the development of efficient and high-performing learning algorithms for SNNs remains a formidable challenge. Techniques such as artificial neural network (ANN)-to-SNN conversion can convert ANNs to SNNs with minimal performance loss, but they necessitate prolonged simulations to approximate rate coding accurately. Conversely, the direct training of SNNs using spike-based backpropagation (BP), such as surrogate gradient approximation, is more flexible and widely adopted. Nevertheless, our research revealed that the shape of the surrogate gradient function profoundly influences the training and inference accuracy of SNNs, yet this shape is typically selected manually before training and remains static throughout the training process. In this article, we introduce a novel k-based leaky integrate-and-fire (KLIF) spiking neural model. KLIF, featuring a learnable parameter, enables the dynamic adjustment of the height and width of the effective surrogate gradient near threshold during training. Our proposed model undergoes evaluation on static CIFAR-10 and CIFAR-100 data sets, as well as neuromorphic CIFAR10-DVS and DVS128-Gesture data sets. Experimental results demonstrate that KLIF outperforms the leaky integrate-and-fire (LIF) model across multiple data sets and network architectures. The superior performance of KLIF positions it as a viable replacement for the essential role of LIF in SNNs across diverse tasks.
KLIF: An Optimized Spiking Neuron Unit for Tuning Surrogate Gradient Function. Chunming Jiang, Yilei Zhang. Neural Computation, 2024-09-23. doi:10.1162/neco_a_01712
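To make the role of the surrogate gradient's shape concrete, here is a generic sketch rather than the KLIF neuron itself, whose definition is not given in the abstract. It assumes a sigmoid-derivative surrogate with a slope parameter `k` (a hypothetical stand-in for a shape parameter): the forward pass uses a hard threshold, while the backward pass substitutes the derivative of `sigmoid(k * (v - threshold))`. The peak height is k/4 and the width shrinks roughly as 1/k, so a single parameter changes both.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def spike(v, threshold=1.0):
    """Forward pass: Heaviside step, whose true derivative is zero almost everywhere."""
    return 1.0 if v >= threshold else 0.0


def surrogate_grad(v, k, threshold=1.0):
    """Backward-pass stand-in: derivative of sigmoid(k * (v - threshold)) w.r.t. v."""
    s = sigmoid(k * (v - threshold))
    return k * s * (1.0 - s)


# Larger k: taller at the threshold, but smaller away from it (narrower).
at_threshold = [surrogate_grad(1.0, k) for k in (2.0, 4.0, 8.0)]
off_threshold = [surrogate_grad(1.5, k) for k in (2.0, 4.0, 8.0)]
```

With a fixed `k` this shape is frozen for the whole training run; making such a parameter learnable is the kind of adjustment the abstract describes, although the exact mechanism used by KLIF may differ.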
Petr Anokhin, Artyom Sorokin, Mikhail Burtsev, Karl Friston
Associative learning is a behavioral phenomenon in which individuals develop connections between stimuli or events based on their co-occurrence. Initially studied by Pavlov in his conditioning experiments, the fundamental principles of learning have been expanded on through the discovery of a wide range of learning phenomena. Computational models have been developed based on the concept of minimizing reward prediction errors. The Rescorla-Wagner model, in particular, is a well-known model that has greatly influenced the field of reinforcement learning. However, the simplicity of these models restricts their ability to fully explain the diverse range of behavioral phenomena associated with learning. In this study, we adopt the free energy principle, which suggests that living systems strive to minimize surprise or uncertainty under their internal models of the world. We consider the learning process as the minimization of free energy and investigate its relationship with the Rescorla-Wagner model, focusing on the informational aspects of learning, different types of surprise, and prediction errors based on beliefs and values. Furthermore, we explore how well-known behavioral phenomena such as blocking, overshadowing, and latent inhibition can be modeled within the active inference framework. We accomplish this by using the informational and novelty aspects of attention, which share similar ideas proposed by seemingly contradictory models such as Mackintosh and Pearce-Hall models. Thus, we demonstrate that the free energy principle, as a theoretical framework derived from first principles, can integrate the ideas and models of associative learning proposed based on empirical experiments and serve as a framework for a better understanding of the computational processes behind associative learning in the brain.
Associative Learning and Active Inference. Petr Anokhin, Artyom Sorokin, Mikhail Burtsev, Karl Friston. Neural Computation, 2024-09-23. doi:10.1162/neco_a_01711
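For readers unfamiliar with the Rescorla-Wagner model mentioned in the abstract above, the sketch below implements its textbook update: on each trial, every stimulus that is present changes its associative strength by a learning rate times the prediction error (reward minus the summed strengths of the present stimuli). This is the classical baseline, not the paper's active inference treatment; the function name, the single combined `rate`, and the trial encoding are assumptions made for this example. It reproduces blocking: after A alone has been trained, a later compound AB leaves almost nothing for B to learn.

```python
def rescorla_wagner(trials, rate=0.3, reward=1.0, strengths=None):
    """Apply the Rescorla-Wagner update over a sequence of rewarded trials.

    Each trial is a set of stimulus names presented together.
    """
    v = dict(strengths or {})
    for present in trials:
        prediction = sum(v.get(s, 0.0) for s in present)
        error = reward - prediction  # shared prediction error for this trial
        for s in present:
            v[s] = v.get(s, 0.0) + rate * error
    return v


# Blocking: pretrain A alone, then present the compound AB.
pretrained = rescorla_wagner([{"A"}] * 50)
blocked = rescorla_wagner([{"A", "B"}] * 50, strengths=pretrained)

# Control: AB from scratch, so A and B share the available strength.
control = rescorla_wagner([{"A", "B"}] * 50)
```

In the blocked condition the prediction error is already near zero when B first appears, so B's strength stays near zero, whereas in the control condition A and B each end near half of the reward value.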
The free energy principle (FEP) describes (biological) agents as minimizing a variational free energy (FE) with respect to a generative model of their environment. Active inference (AIF) is a corollary of the FEP that describes how agents explore and exploit their environment by minimizing an expected FE objective. In two related papers, we describe a scalable, epistemic approach to synthetic AIF by message passing on free-form Forney-style factor graphs (FFGs). A companion paper (part I of this article; Koudahl et al., 2023) introduces a constrained FFG (CFFG) notation that visually represents (generalized) FE objectives for AIF. This article (part II) derives message-passing algorithms that minimize (generalized) FE objectives on a CFFG by variational calculus. A comparison between simulated Bethe and generalized FE agents illustrates how the message-passing approach to synthetic AIF induces epistemic behavior on a T-maze navigation task. Extension of the T-maze simulation to learning goal statistics and a multiagent bargaining setting illustrate how this approach encourages reuse of nodes and updates in alternative settings. With a full message-passing account of synthetic AIF agents, it becomes possible to derive and reuse message updates across models and move closer to industrial applications of synthetic AIF.
Realizing Synthetic Active Inference Agents, Part II: Variational Message Updates. Thijs van de Laar, Magnus Koudahl, Bert de Vries. Neural Computation, 2024-09-23. doi:10.1162/neco_a_01713
Travis Monk, Nik Dennler, Nicholas Ralph, Shavika Rastogi, Saeed Afshar, Pablo Urbizagastegui, Russell Jarvis, André van Schaik, Andrew Adamatzky
Neural action potentials (APs) are difficult to interpret as signal encoders and/or computational primitives. Their relationships with stimuli and behaviors are obscured by the staggering complexity of nervous systems themselves. We can reduce this complexity by observing that "simpler" neuron-less organisms also transduce stimuli into transient electrical pulses that affect their behaviors. Without a complicated nervous system, APs are often easier to understand as signal/response mechanisms. We review examples of nonneural stimulus transductions in domains of life largely neglected by theoretical neuroscience: bacteria, protozoans, plants, fungi, and neuron-less animals. We report properties of those electrical signals (for example, amplitudes, durations, ionic bases, refractory periods, and particularly their ecological purposes). We compare those properties with those of neurons to infer the tasks and selection pressures that neurons satisfy. Throughout the tree of life, nonneural stimulus transductions time behavioral responses to environmental changes. Nonneural organisms represent the presence or absence of a stimulus with the presence or absence of an electrical signal. Their transductions usually exhibit high sensitivity and specificity to a stimulus, but are often slow compared to neurons. Neurons appear to be sacrificing the specificity of their stimulus transductions for sensitivity and speed. We interpret cellular stimulus transductions as a cell's assertion that it detected something important at that moment in time. In particular, we consider neural APs as fast but noisy detection assertions. We infer that a principal goal of nervous systems is to detect extremely weak signals from noisy sensory spikes under enormous time pressure. We discuss neural computation proposals that address this goal by casting neurons as devices that implement online, analog, probabilistic computations with their membrane potentials. Those proposals imply a measurable relationship between afferent neural spiking statistics and efferent neural membrane electrophysiology.
Electrical Signaling Beyond Neurons. Travis Monk, Nik Dennler, Nicholas Ralph, Shavika Rastogi, Saeed Afshar, Pablo Urbizagastegui, Russell Jarvis, André van Schaik, Andrew Adamatzky. Neural Computation, 2024-09-17. doi:10.1162/neco_a_01696
Spiking neural networks (SNNs) are the next-generation neural networks composed of biologically plausible neurons that communicate through trains of spikes. By modifying the plastic parameters of SNNs, including weights and time delays, SNNs can be trained to perform various AI tasks, although in general not at the same level of performance as typical artificial neural networks (ANNs). One possible solution to improve the performance of SNNs is to consider plastic parameters other than just weights and time delays drawn from the inherent complexity of the neural system of the brain, which may help SNNs improve their information processing ability and achieve brainlike functions. Here, we propose reference spikes as a new type of plastic parameters in a supervised learning scheme in SNNs. A neuron receives reference spikes through synapses providing reference information independent of input to help during learning, whose number of spikes and timings are trainable by error backpropagation. Theoretically, reference spikes improve the temporal information processing of SNNs by modulating the integration of incoming spikes at a detailed level. Through comparative computational experiments, we demonstrate using supervised learning that reference spikes improve the memory capacity of SNNs to map input spike patterns to target output spike patterns and increase classification accuracy on the MNIST, Fashion-MNIST, and SHD data sets, where both input and target output are temporally encoded. Our results demonstrate that applying reference spikes improves the performance of SNNs by enhancing their temporal information processing ability.
{"title":"Trainable Reference Spikes Improve Temporal Information Processing of SNNs With Supervised Learning.","authors":"Zeyuan Wang, Luis Cruz","doi":"10.1162/neco_a_01702","journal":{"name":"Neural Computation"},"publicationDate":"2024-09-17","publicationTypes":"Journal Article"}
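The claim that reference spikes modulate the integration of incoming spikes can be illustrated with a toy leaky integrate-and-fire neuron. The following is a minimal sketch under stated assumptions, not the authors' model: the function name `lif_output_spikes`, the weights, the time constant, and the threshold are all hypothetical, and the reference spike times are set by hand rather than trained by error backpropagation.

```python
import numpy as np

def lif_output_spikes(input_times, ref_times, w_in=0.6, w_ref=0.6,
                      tau=10.0, threshold=1.0, t_max=50.0, dt=1.0):
    """Simulate one leaky integrate-and-fire neuron on a discrete time grid.

    Hypothetical toy model: input spikes and reference spikes each add a
    fixed jump to the membrane potential, which decays exponentially.
    """
    n_steps = int(t_max / dt)
    drive = np.zeros(n_steps)
    for t in input_times:
        drive[int(round(t / dt))] += w_in    # synaptic jump from an input spike
    for t in ref_times:
        drive[int(round(t / dt))] += w_ref   # synaptic jump from a reference spike

    decay = np.exp(-dt / tau)                # per-step leak factor
    v = 0.0
    out = []
    for k in range(n_steps):
        v = v * decay + drive[k]             # leak, then integrate this step's spikes
        if v >= threshold:
            out.append(k * dt)               # record the output spike time
            v = 0.0                          # reset after firing
    return out

# A single input spike alone stays below threshold.
no_ref = lif_output_spikes(input_times=[10], ref_times=[])
# A reference spike shortly before the input lifts the neuron over threshold.
near_ref = lif_output_spikes(input_times=[10], ref_times=[8])
# The same reference spike placed much earlier has decayed too far to help.
far_ref = lif_output_spikes(input_times=[10], ref_times=[0])
```

In this toy setting, whether the neuron fires depends on where the reference spike sits relative to the input, which is the kind of timing dependence a trainable reference spike could exploit.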
Nina Baldy, Martin Breyton, Marmaduke M Woodman, Viktor K Jirsa, Meysam Hashemi
Inference on networks of spiking neurons is essential to decipher the underlying mechanisms of brain computation and function. In this study, we conduct inference on the parameters and dynamics of a mean-field approximation, which simplifies the interactions between neurons. Estimating the parameters of this class of generative model allows one to predict the system's dynamics and responses under changing inputs and, indeed, changing parameters. We first assume a set of known state-space equations and address the problem of inferring the lumped parameters from observed time series. Crucially, we consider this problem in the setting of bistability, random fluctuations in system dynamics, and partial observations, in which some states are hidden. To identify the most efficient estimation or inversion scheme for this particular system identification problem, we benchmark against state-of-the-art optimization and Bayesian estimation algorithms, highlighting their strengths and weaknesses. Additionally, we explore how well the statistical relationships between parameters are maintained across different scales. We find that deep neural density estimators outperform other algorithms in the inversion scheme, although they may overestimate uncertainty and correlation between parameters. This issue can be mitigated by incorporating time-delay embedding. We then eschew the mean-field approximation and employ deep neural ODEs on spiking neurons, illustrating prediction of system dynamics and vector fields from microscopic states. Overall, this study affords an opportunity to predict brain dynamics and responses to various perturbations or pharmacological interventions using deep neural networks.
{"title":"Inference on the Macroscopic Dynamics of Spiking Neurons.","authors":"Nina Baldy, Martin Breyton, Marmaduke M Woodman, Viktor K Jirsa, Meysam Hashemi","doi":"10.1162/neco_a_01701","journal":{"name":"Neural Computation"},"publicationDate":"2024-09-17","publicationTypes":"Journal Article"}
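The abstract above mentions time-delay embedding as a remedy when only some states are observed. As a rough illustration of the standard construction, and not the authors' pipeline, the sketch below stacks lagged copies of a single observed series into a matrix of delay vectors. The function name `delay_embed` and the parameters `dim` and `lag` are hypothetical choices for this example.

```python
import numpy as np

def delay_embed(x, dim, lag):
    """Return the delay-embedding matrix of a 1-D series.

    Row t is [x[t], x[t + lag], ..., x[t + (dim - 1) * lag]].
    """
    x = np.asarray(x, dtype=float)
    n_rows = len(x) - (dim - 1) * lag
    if n_rows <= 0:
        raise ValueError("series too short for the requested dim and lag")
    # Each column is the series shifted forward by a multiple of the lag.
    columns = [x[i * lag : i * lag + n_rows] for i in range(dim)]
    return np.stack(columns, axis=1)

# Six samples, three delay coordinates, lag of two samples.
embedded = delay_embed([0, 1, 2, 3, 4, 5], dim=3, lag=2)
# embedded has shape (2, 3) with rows [0, 2, 4] and [1, 3, 5]
```

Each row can then be passed to an estimator in place of the single observed value, so that the recent history stands in for the hidden states.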