Optimal Feedback Control for the Proportion of Energy Cost in the Upper-Arm Reaching Movement
Yoshiaki Taniai. Neural Computation, 2023-10-10. DOI: 10.1162/neco_a_01614

The minimum expected energy cost model, which has been proposed as one of the optimization principles for movement planning, can reproduce many characteristics of the human upper-arm reaching movement when signal-dependent noise and the co-contraction of antagonist muscles are considered. Discussion of these optimization principles has mainly been based on feedforward control; however, there is debate as to whether the central nervous system uses a feedforward or a feedback control process. Previous studies have shown that feedback control based on modified linear-quadratic gaussian (LQG) control, which includes multiplicative noise, can reproduce many characteristics of the reaching movement. Although the cost in LQG control consists of state and energy costs, the relationship between the energy cost and the characteristics of the reaching movement under LQG control has not been studied. In this work, I investigated how the optimal movement based on LQG control varied with the proportion of energy cost, assuming that the central nervous system uses feedback control. When the cost contained specific proportions of energy cost, the optimal movement reproduced the characteristics of the reaching movement. This result shows that energy cost is essential in both feedforward and feedback control for reproducing the characteristics of the upper-arm reaching movement.
Grid Cell Percolation
Yuri Dabaghian. Neural Computation, 2023-09-08. DOI: 10.1162/neco_a_01606

Grid cells play a principal role in enabling cognitive representations of ambient environments. The key property of these cells—the regular arrangement of their firing fields—is commonly viewed as a means for establishing spatial scales or encoding specific locations. However, using grid cells' spiking outputs to deduce geometric orderliness proves to be a strenuous task due to the fairly irregular activation patterns triggered by the animal's sporadic visits to the grid fields. This article addresses the statistical mechanisms enabling emergent regularity of grid cell firing activity from the perspective of percolation theory. Using percolation phenomena to model the effect of the rat's moves through the lattices of firing fields sheds new light on the mechanisms of spatial information processing, spatial learning, path integration, and establishing spatial metrics. It is also shown that the physiological parameters required for spiking percolation match the experimental range, including the characteristic 2/3 ratio between the grid fields' size and the grid spacing, pointing to the biological viability of the approach.
Learning Intention-Aware Policies in Deep Reinforcement Learning
T. Zhao, S. Wu, G. Li, Y. Chen, G. Niu, and Masashi Sugiyama. Neural Computation, 2023-09-08. DOI: 10.1162/neco_a_01607

Deep reinforcement learning (DRL) provides an agent with an optimal policy that maximizes cumulative rewards. The policy defined in DRL mainly depends on the state, historical memory, and policy model parameters. However, humans usually take actions according to their own intentions, such as moving fast or slow, beyond the elements included in traditional policy models. In order to make the action-choosing mechanism more similar to that of humans and to let the agent select actions that incorporate intentions, we propose an intention-aware policy learning method in this letter. To formalize this process, we first define an intention-aware policy by incorporating the intention information into the policy model, which is learned by maximizing the cumulative rewards together with the mutual information (MI) between the intention and the action. We then derive an approximation of the MI objective that can be optimized efficiently. Finally, we demonstrate the effectiveness of the intention-aware policy on the classical MuJoCo control task and the multigoal continuous chain walking task.
Exploring Trade-Offs in Spiking Neural Networks
Florian Bacho and Dominique Chu. Neural Computation, 2023-09-08. DOI: 10.1162/neco_a_01609

Spiking neural networks (SNNs) have emerged as a promising alternative to traditional deep neural networks for low-power computing. However, the effectiveness of SNNs is not solely determined by their performance but also by their energy consumption, prediction speed, and robustness to noise. The recent method Fast & Deep, along with others, achieves fast and energy-efficient computation by constraining neurons to fire at most once. Known as time-to-first-spike (TTFS), this constraint, however, restricts the capabilities of SNNs in many aspects. In this work, we explore the relationships among performance, energy consumption, speed, and stability under this constraint. More precisely, we highlight the existence of trade-offs where performance and robustness are gained at the cost of sparsity and prediction latency. To improve these trade-offs, we propose a relaxed version of Fast & Deep that allows multiple spikes per neuron. Our experiments show that relaxing the spike constraint provides higher performance while also benefiting from faster convergence, similar sparsity, comparable prediction latency, and better robustness to noise compared to TTFS SNNs. By highlighting the limitations of TTFS and demonstrating the advantages of unconstrained SNNs, we provide valuable insight for the development of effective learning strategies for neuromorphic computing.
Transfer Learning With Singular Value Decomposition of Multichannel Convolution Matrices
Tak Shing Au Yeung, Ka Chun Cheung, Michael K. Ng, Simon See, and Andy Yip. Neural Computation, 2023-09-08. DOI: 10.1162/neco_a_01608

The task of transfer learning using pretrained convolutional neural networks is considered. We propose a convolution-SVD layer to analyze the convolution operators with a singular value decomposition computed in the Fourier domain. Singular vectors extracted from the source domain are transferred to the target domain, whereas the singular values are fine-tuned with a target data set. In this way, dimension reduction is achieved to avoid overfitting, while some flexibility to fine-tune the convolution kernels is maintained. We extend an existing convolution kernel reconstruction algorithm to allow for a reconstruction from an arbitrary set of learned singular values. A generalization bound for a single convolution-SVD layer is devised to show the consistency between training and testing errors. We further introduce a notion of transfer learning gap. We prove that the testing error for a single convolution-SVD layer is bounded in terms of the gap, which motivates us to develop a regularization model with the gap as the regularizer. Numerical experiments are conducted to demonstrate the superiority of the proposed model in solving classification problems and the influence of various parameters. In particular, the regularization is shown to yield a significantly higher prediction accuracy.
A Noise-Based Novel Strategy for Faster SNN Training
Chunming Jiang and Yilei Zhang. Neural Computation, 2023-08-07. DOI: 10.1162/neco_a_01604

Spiking neural networks (SNNs) are receiving increasing attention due to their low power consumption and strong bioplausibility. Optimization of SNNs is a challenging task. Two main methods, artificial neural network (ANN)-to-SNN conversion and spike-based backpropagation (BP), both have advantages and limitations. ANN-to-SNN conversion requires a long inference time to approximate the accuracy of the ANN, thus diminishing the benefits of the SNN. With spike-based BP, training high-precision SNNs typically consumes dozens of times more computational resources and time than their ANN counterparts. In this letter, we propose a novel SNN training approach that combines the benefits of the two methods. We first train a single-step SNN (T = 1) by approximating the neural potential distribution with random noise, then convert the single-step SNN (T = 1) to a multistep SNN (T = N) losslessly. The introduction of gaussian-distributed noise leads to a significant gain in accuracy after conversion. The results show that our method considerably reduces the training and inference times of SNNs while maintaining their high accuracy. Compared to the previous two methods, ours can reduce training time by 65% to 75% and achieves more than 100 times faster inference speed. We also argue that the noise-augmented neuron model is more bioplausible.
Composite Optimization Algorithms for Sigmoid Networks
Huixiong Chen and Qi Ye. Neural Computation, 2023-08-07. DOI: 10.1162/neco_a_01603

In this letter, we use composite optimization algorithms to solve the training problem of sigmoid networks. We equivalently reformulate this problem as a convex composite optimization and propose composite optimization algorithms based on linearized proximal algorithms and the alternating direction method of multipliers. Under the assumptions of weak sharp minima and a regularity condition, the algorithm is guaranteed to converge to a globally optimal solution of the objective function even for nonconvex and nonsmooth problems. Furthermore, the convergence results can be directly related to the amount of training data and provide a general guide for setting the size of sigmoid networks. Numerical experiments on Franke's function fitting and handwritten digit recognition show that the proposed algorithms perform satisfactorily and robustly.
Mirror Descent of Hopfield Model
Hyungjoon Soh, Dongyeob Kim, Juno Hwang, and Junghyo Jo. Neural Computation, 2023-08-07. DOI: 10.1162/neco_a_01602

Mirror descent is an elegant optimization technique that leverages a dual space of parametric models to perform gradient descent. While originally developed for convex optimization, it has increasingly been applied in the field of machine learning. In this study, we propose a novel approach for using mirror descent to initialize the parameters of neural networks. Specifically, we demonstrate that by using the Hopfield model as a prototype for neural networks, mirror descent can effectively train the model with significantly improved performance compared to traditional gradient descent methods that rely on random parameter initialization. Our findings highlight the potential of mirror descent as a promising initialization technique for enhancing the optimization of machine learning models.
Mean-Field Approximations With Adaptive Coupling for Networks With Spike-Timing-Dependent Plasticity
Benoit Duchet, Christian Bick, and Áine Byrne. Neural Computation, 2023-08-07. DOI: 10.1162/neco_a_01601

Understanding the effect of spike-timing-dependent plasticity (STDP) is key to elucidating how neural networks change over long timescales and to designing interventions aimed at modulating such networks in neurological disorders. However, progress is restricted by the significant computational cost of simulating neural network models with STDP and by the lack of a low-dimensional description that could provide analytical insight. Phase-difference-dependent plasticity (PDDP) rules, which prescribe synaptic changes based on the phase differences of neuron pairs rather than differences in spike timing, approximate STDP in phase oscillator networks. Here we construct mean-field approximations for phase oscillator networks with STDP to describe part of the phase space of this very high-dimensional system. We first show that single-harmonic PDDP rules can approximate a simple form of symmetric STDP, while multiharmonic rules are required to accurately approximate causal STDP. We then derive exact expressions for the evolution of the average PDDP coupling weight in terms of network synchrony. For adaptive networks of Kuramoto oscillators that form clusters, we formulate a family of low-dimensional descriptions based on the mean-field dynamics of each cluster and the average coupling weights between and within clusters. Finally, we show that such a two-cluster mean-field model can be fitted to synthetic data to provide a low-dimensional approximation of a full adaptive network with symmetric STDP. Our framework represents a step toward a low-dimensional description of adaptive networks with STDP and could, for example, inform the development of new therapies aimed at maximizing the long-lasting effects of brain stimulation.
On an Interpretation of ResNets via Gate-Network Control
Changcun Huang. Neural Computation, 2023-08-07. DOI: 10.1162/neco_a_01600

This letter first constructs a typical solution of ResNets for multicategory classification based on the idea of gate control in LSTMs, from which a general interpretation of the ResNet architecture is given and its performance mechanism explained. We then use further solutions to demonstrate the generality of that interpretation. The classification result is extended to the universal-approximation capability of the type of ResNet with two-layer gate networks, an architecture proposed in the original ResNet paper that has both theoretical and practical significance.