Spiking neural networks (SNNs) have garnered significant attention owing to their adeptness in processing temporal information, low power consumption, and enhanced biological plausibility. Despite these advantages, the development of efficient and high-performing learning algorithms for SNNs remains a formidable challenge. Techniques such as artificial neural network (ANN)-to-SNN conversion can convert ANNs to SNNs with minimal performance loss, but they necessitate prolonged simulations to approximate rate coding accurately. Conversely, the direct training of SNNs using spike-based backpropagation (BP), such as surrogate gradient approximation, is more flexible and widely adopted. Nevertheless, our research revealed that the shape of the surrogate gradient function profoundly influences the training and inference accuracy of SNNs, yet this shape is typically selected manually before training and remains static throughout the training process. In this article, we introduce a novel k-based leaky integrate-and-fire (KLIF) spiking neural model. KLIF, featuring a learnable parameter, enables dynamic adjustment of the height and width of the effective surrogate gradient near the threshold during training. The proposed model is evaluated on the static CIFAR-10 and CIFAR-100 data sets, as well as the neuromorphic CIFAR10-DVS and DVS128-Gesture data sets. Experimental results demonstrate that KLIF outperforms the leaky integrate-and-fire (LIF) model across multiple data sets and network architectures. The superior performance of KLIF positions it as a viable replacement for the essential role of LIF in SNNs across diverse tasks.
{"title":"KLIF: An Optimized Spiking Neuron Unit for Tuning Surrogate Gradient Function","authors":"Chunming Jiang;Yilei Zhang","doi":"10.1162/neco_a_01712","DOIUrl":"10.1162/neco_a_01712","url":null,"abstract":"Spiking neural networks (SNNs) have garnered significant attention owing to their adeptness in processing temporal information, low power consumption, and enhanced biological plausibility. Despite these advantages, the development of efficient and high-performing learning algorithms for SNNs remains a formidable challenge. Techniques such as artificial neural network (ANN)-to-SNN conversion can convert ANNs to SNNs with minimal performance loss, but they necessitate prolonged simulations to approximate rate coding accurately. Conversely, the direct training of SNNs using spike-based backpropagation (BP), such as surrogate gradient approximation, is more flexible and widely adopted. Nevertheless, our research revealed that the shape of the surrogate gradient function profoundly influences the training and inference accuracy of SNNs. Importantly, we identified that the shape of the surrogate gradient function significantly affects the final training accuracy. The shape of the surrogate gradient function is typically manually selected before training and remains static throughout the training process. In this article, we introduce a novel k-based leaky integrate-and-fire (KLIF) spiking neural model. KLIF, featuring a learnable parameter, enables the dynamic adjustment of the height and width of the effective surrogate gradient near threshold during training. Our proposed model undergoes evaluation on static CIFAR-10 and CIFAR-100 data sets, as well as neuromorphic CIFAR10-DVS and DVS128-Gesture data sets. Experimental results demonstrate that KLIF outperforms the leaky Integrate-and-Fire (LIF) model across multiple data sets and network architectures. The superior performance of KLIF positions it as a viable replacement for the essential role of LIF in SNNs across diverse tasks.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 12","pages":"2636-2650"},"PeriodicalIF":2.7,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Petr Anokhin;Artyom Sorokin;Mikhail Burtsev;Karl Friston
Associative learning is a behavioral phenomenon in which individuals develop connections between stimuli or events based on their co-occurrence. First studied by Pavlov in his conditioning experiments, its fundamental principles have been expanded on through the discovery of a wide range of learning phenomena. Computational models have been developed based on the concept of minimizing reward prediction errors. The Rescorla-Wagner model, in particular, is a well-known model that has greatly influenced the field of reinforcement learning. However, the simplicity of these models restricts their ability to fully explain the diverse range of behavioral phenomena associated with learning. In this study, we adopt the free energy principle, which suggests that living systems strive to minimize surprise or uncertainty under their internal models of the world. We consider the learning process as the minimization of free energy and investigate its relationship with the Rescorla-Wagner model, focusing on the informational aspects of learning, different types of surprise, and prediction errors based on beliefs and values. Furthermore, we explore how well-known behavioral phenomena such as blocking, overshadowing, and latent inhibition can be modeled within the active inference framework. We accomplish this by using the informational and novelty aspects of attention, which echo ideas proposed by seemingly contradictory models such as the Mackintosh and Pearce-Hall models. Thus, we demonstrate that the free energy principle, as a theoretical framework derived from first principles, can integrate the ideas and models of associative learning proposed on the basis of empirical experiments and serve as a framework for a better understanding of the computational processes behind associative learning in the brain.
{"title":"Associative Learning and Active Inference","authors":"Petr Anokhin;Artyom Sorokin;Mikhail Burtsev;Karl Friston","doi":"10.1162/neco_a_01711","DOIUrl":"10.1162/neco_a_01711","url":null,"abstract":"Associative learning is a behavioral phenomenon in which individuals develop connections between stimuli or events based on their co-occurrence. Initially studied by Pavlov in his conditioning experiments, the fundamental principles of learning have been expanded on through the discovery of a wide range of learning phenomena. Computational models have been developed based on the concept of minimizing reward prediction errors. The Rescorla-Wagner model, in particular, is a well-known model that has greatly influenced the field of reinforcement learning. However, the simplicity of these models restricts their ability to fully explain the diverse range of behavioral phenomena associated with learning. In this study, we adopt the free energy principle, which suggests that living systems strive to minimize surprise or uncertainty under their internal models of the world. We consider the learning process as the minimization of free energy and investigate its relationship with the Rescorla-Wagner model, focusing on the informational aspects of learning, different types of surprise, and prediction errors based on beliefs and values. Furthermore, we explore how well-known behavioral phenomena such as blocking, overshadowing, and latent inhibition can be modeled within the active inference framework. We accomplish this by using the informational and novelty aspects of attention, which share similar ideas proposed by seemingly contradictory models such as Mackintosh and Pearce-Hall models. Thus, we demonstrate that the free energy principle, as a theoretical framework derived from first principles, can integrate the ideas and models of associative learning proposed based on empirical experiments and serve as a framework for a better understanding of the computational processes behind associative learning in the brain.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 12","pages":"2602-2635"},"PeriodicalIF":2.7,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The current reinforcement learning framework focuses exclusively on performance, often at the expense of efficiency. In contrast, biological control achieves remarkable performance while also optimizing computational energy expenditure and decision frequency. We propose a decision-bounded Markov decision process (DB-MDP) that constrains the number of decisions and the computational energy available to agents in reinforcement learning environments. Our experiments demonstrate that existing reinforcement learning algorithms struggle within this framework, leading to either failure or suboptimal performance. To address this, we introduce a biologically inspired, temporally layered architecture (TLA) that enables agents to manage computational costs through two layers with distinct timescales and energy requirements. TLA achieves optimal performance in decision-bounded environments and, in continuous control environments, matches state-of-the-art performance at a fraction of the computing cost. Compared to current reinforcement learning algorithms that prioritize performance alone, our approach significantly lowers computational energy expenditure while maintaining performance. These findings establish a benchmark and pave the way for future research on energy- and time-aware control.
{"title":"Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures","authors":"Devdhar Patel;Terrence Sejnowski;Hava Siegelmann","doi":"10.1162/neco_a_01718","DOIUrl":"10.1162/neco_a_01718","url":null,"abstract":"The current reinforcement learning framework focuses exclusively on performance, often at the expense of efficiency. In contrast, biological control achieves remarkable performance while also optimizing computational energy expenditure and decision frequency. We propose a decision-bounded Markov decision process (DB-MDP) that constrains the number of decisions and computational energy available to agents in reinforcement learning environments. Our experiments demonstrate that existing reinforcement learning algorithms struggle within this framework, leading to either failure or suboptimal performance. To address this, we introduce a biologically inspired, temporally layered architecture (TLA), enabling agents to manage computational costs through two layers with distinct timescales and energy requirements. TLA achieves optimal performance in decision-bounded environments and in continuous control environments, matching state-of-the-art performance while using a fraction of the computing cost. Compared to current reinforcement learning algorithms that solely prioritize performance, our approach significantly lowers computational energy expenditure while maintaining performance. These findings establish a benchmark and pave the way for future research on energy and time-aware control.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 12","pages":"2734-2763"},"PeriodicalIF":2.7,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142395375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Valentin Leplat;Le T. K. Hien;Akwum Onwunta;Nicolas Gillis
Deep nonnegative matrix factorization (deep NMF) has recently emerged as a valuable technique for extracting multiple layers of features across different scales. However, all existing deep NMF models and algorithms have primarily centered their evaluation on the least squares error, which may not be the most appropriate metric for assessing the quality of approximations on diverse data sets. For instance, when dealing with data types such as audio signals and documents, it is widely acknowledged that β-divergences offer a more suitable alternative. In this article, we develop new models and algorithms for deep NMF using some β-divergences, with a focus on the Kullback-Leibler divergence. Subsequently, we apply these techniques to the extraction of facial features, the identification of topics within document collections, and the identification of materials within hyperspectral images.
{"title":"Deep Nonnegative Matrix Factorization With Beta Divergences","authors":"Valentin Leplat;Le T. K. Hien;Akwum Onwunta;Nicolas Gillis","doi":"10.1162/neco_a_01679","DOIUrl":"10.1162/neco_a_01679","url":null,"abstract":"Deep nonnegative matrix factorization (deep NMF) has recently emerged as a valuable technique for extracting multiple layers of features across different scales. However, all existing deep NMF models and algorithms have primarily centered their evaluation on the least squares error, which may not be the most appropriate metric for assessing the quality of approximations on diverse data sets. For instance, when dealing with data types such as audio signals and documents, it is widely acknowledged that ß-divergences offer a more suitable alternative. In this article, we develop new models and algorithms for deep NMF using some ß-divergences, with a focus on the Kullback-Leibler divergence. Subsequently, we apply these techniques to the extraction of facial features, the identification of topics within document collections, and the identification of materials within hyperspectral images.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2365-2402"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Active inference is a state-of-the-art framework for modeling the brain that explains a wide range of mechanisms. Recently, two versions of branching time active inference (BTAI) have been developed to handle the exponential (space and time) complexity class that occurs when computing the prior over all possible policies up to the time horizon. However, those two versions of BTAI still suffer from an exponential complexity class with regard to the number of observed and latent variables being modeled. We resolve this limitation by allowing each observation to have its own likelihood mapping and each latent variable to have its own transition mapping. The implicit mean field approximation was tested in terms of its efficiency and computational cost using a dSprites environment in which the metadata of the dSprites data set was used as input to the model. In this setting, earlier implementations of branching time active inference (namely, BTAI_VMP and BTAI_BF) underperformed relative to the mean field approximation (BTAI_3MF) in both performance and computational efficiency. Specifically, BTAI_VMP was able to solve 96.9% of the task in 5.1 seconds, and BTAI_BF was able to solve 98.6% of the task in 17.5 seconds. Our new approach outperformed both of its predecessors by solving the task completely (100%) in only 2.559 seconds.
{"title":"Multimodal and Multifactor Branching Time Active Inference","authors":"Théophile Champion;Marek Grześ;Howard Bowman","doi":"10.1162/neco_a_01703","DOIUrl":"10.1162/neco_a_01703","url":null,"abstract":"Active inference is a state-of-the-art framework for modeling the brain that explains a wide range of mechanisms. Recently, two versions of branching time active inference (BTAI) have been developed to handle the exponential (space and time) complexity class that occurs when computing the prior over all possible policies up to the time horizon. However, those two versions of BTAI still suffer from an exponential complexity class with regard to the number of observed and latent variables being modeled. We resolve this limitation by allowing each observation to have its own likelihood mapping and each latent variable to have its own transition mapping. The implicit mean field approximation was tested in terms of its efficiency and computational cost using a dSprites environment in which the metadata of the dSprites data set was used as input to the model. In this setting, earlier implementations of branching time active inference (namely, BTAIVMP and BTAIBF) underperformed in relation to the mean field approximation (BTAI3MF) in terms of performance and computational efficiency. Specifically, BTAIVMP was able to solve 96.9% of the task in 5.1 seconds, and BTAIBF was able to solve 98.6% of the task in 17.5 seconds. Our new approach outperformed both of its predecessors by solving the task completely (100%) in only 2.559 seconds.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2479-2504"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142114870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We discuss prototype formation in the Hopfield network. Typically, Hebbian learning with highly correlated states leads to degraded memory performance. We show that this type of learning can lead to prototype formation, where unlearned states emerge as representatives of large correlated subsets of states, alleviating capacity woes. This process has similarities to prototype learning in human cognition. We provide a substantial literature review of prototype learning in associative memories, covering contributions from psychology, statistical physics, and computer science. We analyze prototype formation from a theoretical perspective and derive a stability condition for these states based on the number of examples of the prototype presented for learning, the noise in those examples, and the number of nonexample states presented. The stability condition is used to construct a probability of stability for a prototype state as the factors of stability change. We also note similarities to traditional network analysis, allowing us to find a prototype capacity. We corroborate these expectations of prototype formation with experiments using a simple Hopfield network with standard Hebbian learning. We extend our experiments to a Hopfield network trained on data with multiple prototypes and find the network is capable of stabilizing multiple prototypes concurrently. We measure the basins of attraction of the multiple prototype states, finding that attractor strength grows with the number of examples and the agreement of examples. We link the stability and dominance of prototype states to the energy profile of these states, particularly when comparing the profile shape to that of target states or other spurious states.
{"title":"Prototype Analysis in Hopfield Networks With Hebbian Learning","authors":"Hayden McAlister;Anthony Robins;Lech Szymanski","doi":"10.1162/neco_a_01704","DOIUrl":"10.1162/neco_a_01704","url":null,"abstract":"We discuss prototype formation in the Hopfield network. Typically, Hebbian learning with highly correlated states leads to degraded memory performance. We show that this type of learning can lead to prototype formation, where unlearned states emerge as representatives of large correlated subsets of states, alleviating capacity woes. This process has similarities to prototype learning in human cognition. We provide a substantial literature review of prototype learning in associative memories, covering contributions from psychology, statistical physics, and computer science. We analyze prototype formation from a theoretical perspective and derive a stability condition for these states based on the number of examples of the prototype presented for learning, the noise in those examples, and the number of nonexample states presented. The stability condition is used to construct a probability of stability for a prototype state as the factors of stability change. We also note similarities to traditional network analysis, allowing us to find a prototype capacity. We corroborate these expectations of prototype formation with experiments using a simple Hopfield network with standard Hebbian learning. We extend our experiments to a Hopfield network trained on data with multiple prototypes and find the network is capable of stabilizing multiple prototypes concurrently. We measure the basins of attraction of the multiple prototype states, finding attractor strength grows with the number of examples and the agreement of examples. We link the stability and dominance of prototype states to the energy profile of these states, particularly when comparing the profile shape to target states or other spurious states.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2322-2364"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142114872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Latent space Bayesian optimization (LSBO) combines generative models, typically variational autoencoders (VAEs), with Bayesian optimization (BO) to generate de novo objects of interest. However, LSBO faces challenges due to the mismatch between the objectives of BO and the VAE, resulting in poor exploration capabilities. In this article, we propose novel contributions to enhance LSBO efficiency and overcome this challenge. We first introduce the concept of latent consistency/inconsistency as a crucial problem in LSBO, arising from the VAE-BO mismatch. To address this, we propose the latent-consistency-aware acquisition function (LCA-AF), which leverages consistent points in LSBO. Additionally, we present LCA-VAE, a novel VAE method that creates a latent space with more consistent points through data augmentation in latent space and penalization of latent inconsistencies. Combining LCA-VAE and LCA-AF, we develop LCA-LSBO. Our approach achieves high sample efficiency and effective exploration, emphasizing the significance of addressing latent consistency through the novel incorporation of data augmentation in latent space within LCA-VAE in LSBO. We showcase the performance of our proposal on de novo image generation and de novo chemical design tasks.
{"title":"Latent Space Bayesian Optimization With Latent Data Augmentation for Enhanced Exploration","authors":"Onur Boyar;Ichiro Takeuchi","doi":"10.1162/neco_a_01708","DOIUrl":"10.1162/neco_a_01708","url":null,"abstract":"Latent space Bayesian optimization (LSBO) combines generative models, typically variational autoencoders (VAE), with Bayesian optimization (BO), to generate de novo objects of interest. However, LSBO faces challenges due to the mismatch between the objectives of BO and VAE, resulting in poor exploration capabilities. In this article, we propose novel contributions to enhance LSBO efficiency and overcome this challenge. We first introduce the concept of latent consistency/inconsistency as a crucial problem in LSBO, arising from the VAE-BO mismatch. To address this, we propose the latent consistent aware-acquisition function (LCA-AF) that leverages consistent points in LSBO. Additionally, we present LCA-VAE, a novel VAE method that creates a latent space with increased consistent points through data augmentation in latent space and penalization of latent inconsistencies. Combining LCA-VAE and LCA-AF, we develop LCA-LSBO. Our approach achieves high sample efficiency and effective exploration, emphasizing the significance of addressing latent consistency through the novel incorporation of data augmentation in latent space within LCA-VAE in LSBO. We showcase the performance of our proposal via de novo image generation and de novo chemical design tasks.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2446-2478"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We describe a computational model for inferring 3D structure from the motion of projected 2D points in an image, with the aim of understanding how biological vision systems learn and internally represent 3D transformations from the statistics of their input. The model uses manifold transport operators to describe the action of 3D points in a scene as they undergo transformation. We show that the model can learn the generator of the Lie group for these transformations from purely 2D input, providing a proof-of-concept demonstration for how biological systems could adapt their internal representations based on sensory input. Focusing on a rotational model, we evaluate the ability of the model to infer depth from moving 2D projected points and to learn rotational transformations from 2D training stimuli. Finally, we compare the model performance to psychophysical performance on structure-from-motion tasks.
{"title":"Learning Internal Representations of 3D Transformations From 2D Projected Inputs","authors":"Marissa Connor;Bruno Olshausen;Christopher Rozell","doi":"10.1162/neco_a_01695","DOIUrl":"10.1162/neco_a_01695","url":null,"abstract":"We describe a computational model for inferring 3D structure from the motion of projected 2D points in an image, with the aim of understanding how biological vision systems learn and internally represent 3D transformations from the statistics of their input. The model uses manifold transport operators to describe the action of 3D points in a scene as they undergo transformation. We show that the model can learn the generator of the Lie group for these transformations from purely 2D input, providing a proof-of-concept demonstration for how biological systems could adapt their internal representations based on sensory input. Focusing on a rotational model, we evaluate the ability of the model to infer depth from moving 2D projected points and to learn rotational transformations from 2D training stimuli. Finally, we compare the model performance to psychophysical performance on structure-from-motion tasks.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2505-2539"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141984035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michał Markiewicz;Ireneusz Brzozowski;Szymon Janusz
Von Neumann architecture requires information to be encoded as numerical values. For that reason, artificial neural networks running on computers require the data coming from sensors to be discretized. Other network architectures that more closely mimic biological neural networks (e.g., spiking neural networks) can be simulated on von Neumann architecture, but, more important, they can also be executed on dedicated electrical circuits with orders of magnitude lower power consumption. Unfortunately, input signal conditioning and encoding are usually not supported by such circuits, so a separate module consisting of an analog-to-digital converter, encoder, and transmitter is required. The aim of this article is to propose a sensor architecture whose output signal can be directly connected to the input of a spiking neural network. We demonstrate that the output signal is a valid spike source for Izhikevich model neurons, ensuring the proper operation of a number of neurocomputational features. The advantages are clear: much lower power consumption, smaller area, and a less complex electronic circuit. The main disadvantage is that the sensor characteristics constrain the parameters of the spiking neurons that can be used. The proposed architecture is illustrated by a case study involving a capacitive pressure sensor circuit, which is compatible with most of the neurocomputational properties of the Izhikevich neuron model. The sensor itself is characterized by very low power consumption: it draws only 3.49 μA at 3.3 V.
{"title":"Spiking Neural Network Pressure Sensor","authors":"Michał Markiewicz;Ireneusz Brzozowski;Szymon Janusz","doi":"10.1162/neco_a_01706","DOIUrl":"10.1162/neco_a_01706","url":null,"abstract":"Von Neumann architecture requires information to be encoded as numerical values. For that reason, artificial neural networks running on computers require the data coming from sensors to be discretized. Other network architectures that more closely mimic biological neural networks (e.g., spiking neural networks) can be simulated on von Neumann architecture, but more important, they can also be executed on dedicated electrical circuits having orders of magnitude less power consumption. Unfortunately, input signal conditioning and encoding are usually not supported by such circuits, so a separate module consisting of an analog-to-digital converter, encoder, and transmitter is required. The aim of this article is to propose a sensor architecture, the output signal of which can be directly connected to the input of a spiking neural network. We demonstrate that the output signal is a valid spike source for the Izhikevich model neurons, ensuring the proper operation of a number of neurocomputational features. The advantages are clear: much lower power consumption, smaller area, and a less complex electronic circuit. The main disadvantage is that sensor characteristics somehow limit the parameters of applicable spiking neurons. The proposed architecture is illustrated by a case study involving a capacitive pressure sensor circuit, which is compatible with most of the neurocomputational properties of the Izhikevich neuron model. The sensor itself is characterized by very low power consumption: it draws only 3.49 μA at 3.3 V.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2299-2321"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142037774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a new method of independent component analysis (ICA) to extract appropriate features from high-dimensional data. In general, matrix factorization methods, including ICA, suffer from limited interpretability of the extracted features. A sparsity constraint on the factorized matrix helps improve interpretability. With this background, we construct a new ICA method with sparsity. In our method, an ℓ1-regularization term is added to the cost function of ICA, and the cost function is minimized by a difference-of-convex-functions algorithm. To validate the proposed method, we apply it to synthetic data and real functional magnetic resonance imaging data.
{"title":"ℓ1-Regularized ICA: A Novel Method for Analysis of Task-Related fMRI Data","authors":"Yusuke Endo;Koujin Takeda","doi":"10.1162/neco_a_01709","DOIUrl":"10.1162/neco_a_01709","url":null,"abstract":"We propose a new method of independent component analysis (ICA) in order to extract appropriate features from high-dimensional data. In general, matrix factorization methods including ICA have a problem regarding the interpretability of extracted features. For the improvement of interpretability, sparse constraint on a factorized matrix is helpful. With this background, we construct a new ICA method with sparsity. In our method, the ℓ1-regularization term is added to the cost function of ICA, and minimization of the cost function is performed by a difference of convex functions algorithm. For the validity of our proposed method, we apply it to synthetic data and real functional magnetic resonance imaging data.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 11","pages":"2540-2570"},"PeriodicalIF":2.7,"publicationDate":"2024-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142309090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}