
Neural Computation: Latest Articles

Optimal Feedback Control for the Proportion of Energy Cost in the Upper-Arm Reaching Movement
IF 2.9 | Zone 4, Computer Science | Q1 Arts and Humanities | Pub Date: 2023-10-10 | DOI: 10.1162/neco_a_01614
Yoshiaki Taniai
The minimum expected energy cost model, which has been proposed as one of the optimization principles for movement planning, can reproduce many characteristics of the human upper-arm reaching movement when signal-dependent noise and the co-contraction of antagonist muscles are considered. Regarding the optimization principles, discussion has been mainly based on feedforward control; however, there is debate as to whether the central nervous system uses a feedforward or feedback control process. Previous studies have shown that feedback control based on modified linear-quadratic Gaussian (LQG) control, including multiplicative noise, can reproduce many characteristics of the reaching movement. Although the cost in LQG control consists of state and energy costs, the relationship between the energy cost and the characteristics of the reaching movement in LQG control has not been studied. In this work, I investigated how the optimal movement based on LQG control varied with the proportion of energy cost, assuming that the central nervous system used feedback control. When the cost contained specific proportions of energy cost, the optimal movement reproduced the characteristics of the reaching movement. This result shows that energy cost is essential in both feedforward and feedback control for reproducing the characteristics of the upper-arm reaching movement.
Citations: 0
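As a rough illustration of the trade-off described in the entry above, the sketch below runs a scalar, finite-horizon linear-quadratic regulator whose cost mixes a state term and an energy (control) term with weights `1 - p` and `p`. It is a deliberately simplified stand-in, not the author's model: there is no sensory noise, no multiplicative (signal-dependent) noise, and no arm dynamics. The names `lqr_gains`, `rollout`, `p`, `a`, and `b` are assumptions made for this example.

```python
def lqr_gains(p, a=1.0, b=1.0, horizon=50):
    """Feedback gains K_t for x[t+1] = a*x[t] + b*u[t] with u[t] = -K_t * x[t].

    Cost: sum over t of (1 - p) * x[t]**2 + p * u[t]**2, where p in (0, 1)
    is the assumed proportion of energy cost.
    """
    q, r = 1.0 - p, p
    P = q  # terminal cost weight (assumed equal to the running state weight)
    gains = []
    # Backward Riccati recursion from the final step to the first.
    for _ in range(horizon):
        K = (a * b * P) / (r + b * b * P)
        P = q + a * a * P - a * b * P * K
        gains.append(K)
    gains.reverse()  # gains[0] is the gain applied at the first time step
    return gains


def rollout(gains, x0=1.0, a=1.0, b=1.0):
    """Simulate the closed loop; return (final state, total squared control)."""
    x, energy = x0, 0.0
    for K in gains:
        u = -K * x
        energy += u * u
        x = a * x + b * u
    return x, energy


for p in (0.1, 0.5, 0.9):
    gains = lqr_gains(p)
    x_final, energy = rollout(gains)
    print(f"p={p:.1f}  first gain={gains[0]:.3f}  energy={energy:.3f}")
```

In this toy setting, raising `p` lowers both the feedback gain and the total control effort, which is the kind of dependence on the energy-cost proportion that the abstract examines in a far richer model.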
Grid Cell Percolation
IF 2.9 | Zone 4, Computer Science | Q1 Arts and Humanities | Pub Date: 2023-09-08 | DOI: 10.1162/neco_a_01606
Yuri Dabaghian
Grid cells play a principal role in enabling cognitive representations of ambient environments. The key property of these cells—the regular arrangement of their firing fields—is commonly viewed as a means for establishing spatial scales or encoding specific locations. However, using grid cells’ spiking outputs for deducing geometric orderliness proves to be a strenuous task due to fairly irregular activation patterns triggered by the animal’s sporadic visits to the grid fields. This article addresses statistical mechanisms enabling emergent regularity of grid cell firing activity from the perspective of percolation theory. Using percolation phenomena for modeling the effect of the rat’s moves through the lattices of firing fields sheds new light on the mechanisms of spatial information processing, spatial learning, path integration, and establishing spatial metrics. It is also shown that physiological parameters required for spiking percolation match the experimental range, including the characteristic 2/3 ratio between the grid fields’ size and the grid spacing, pointing at a biological viability of the approach.
Citations: 0
Learning Intention-Aware Policies in Deep Reinforcement Learning
IF 2.9 | Zone 4, Computer Science | Q1 Arts and Humanities | Pub Date: 2023-09-08 | DOI: 10.1162/neco_a_01607
T. Zhao;S. Wu;G. Li;Y. Chen;G. Niu;Masashi Sugiyama
Deep reinforcement learning (DRL) provides an agent with an optimal policy so as to maximize the cumulative rewards. The policy defined in DRL mainly depends on the state, historical memory, and policy model parameters. However, we humans usually take actions according to our own intentions, such as moving fast or slow, besides the elements included in the traditional policy models. In order to make the action-choosing mechanism more similar to that of humans and to have the agent select actions that incorporate intentions, we propose an intention-aware policy learning method in this letter. To formalize this process, we first define an intention-aware policy by incorporating the intention information into the policy model, which is learned by maximizing the cumulative rewards together with the mutual information (MI) between the intention and the action. Then we derive an approximation of the MI objective that can be optimized efficiently. Finally, we demonstrate the effectiveness of the intention-aware policy in the classical MuJoCo control task and the multigoal continuous chain walking task.
Citations: 0
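One way to write the kind of objective described in the intention-aware policy entry above is sketched below. The symbols are assumptions for illustration (intention variable $z$, action $a$, trade-off weight $\beta$, discount $\gamma$, auxiliary model $q_\phi$); the letter's exact formulation and its approximation may differ. The second line is the standard variational lower bound on mutual information, shown as one common tractable surrogate rather than as the authors' derivation.

```latex
\begin{align}
J(\theta) &= \mathbb{E}_{\pi_\theta}\!\left[\sum_{t} \gamma^{t} r_t\right] + \beta\, I(z; a), \\
I(z; a) &= H(z) - H(z \mid a) \;\ge\; H(z) + \mathbb{E}_{z,\,a}\!\left[\log q_\phi(z \mid a)\right].
\end{align}
```

The bound holds for any $q_\phi$ and is tight when $q_\phi(z \mid a)$ equals the true posterior over intentions given actions.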
Exploring Trade-Offs in Spiking Neural Networks
IF 2.9 | Zone 4, Computer Science | Q1 Arts and Humanities | Pub Date: 2023-09-08 | DOI: 10.1162/neco_a_01609
Florian Bacho;Dominique Chu
Spiking neural networks (SNNs) have emerged as a promising alternative to traditional deep neural networks for low-power computing. However, the effectiveness of SNNs is not solely determined by their performance but also by their energy consumption, prediction speed, and robustness to noise. The recent method Fast & Deep, along with others, achieves fast and energy-efficient computation by constraining neurons to fire at most once. Known as time-to-first-spike (TTFS), this constraint, however, restricts the capabilities of SNNs in many aspects. In this work, we explore the relationships of performance, energy consumption, speed, and stability when using this constraint. More precisely, we highlight the existence of trade-offs where performance and robustness are gained at the cost of sparsity and prediction latency. To improve these trade-offs, we propose a relaxed version of Fast & Deep that allows for multiple spikes per neuron. Our experiments show that relaxing the spike constraint provides higher performance while also benefiting from faster convergence, similar sparsity, comparable prediction latency, and better robustness to noise compared to TTFS SNNs. By highlighting the limitations of TTFS and demonstrating the advantages of unconstrained SNNs, we provide valuable insight for the development of effective learning strategies for neuromorphic computing.
Citations: 0
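The difference between a time-to-first-spike constraint and a relaxed, multi-spike neuron, as discussed in the entry above, can be seen in a few lines. The sketch below uses a generic discrete-time leaky integrate-and-fire neuron; it is not the Fast & Deep model, and the function name `lif_spike_times` and the parameters `threshold`, `decay`, and `max_spikes` are assumptions chosen for illustration.

```python
def lif_spike_times(inputs, threshold=1.0, decay=0.9, max_spikes=None):
    """Return the time steps at which a leaky integrate-and-fire neuron spikes.

    max_spikes=1 mimics a time-to-first-spike (TTFS) constraint;
    max_spikes=None leaves the neuron unconstrained.
    """
    v = 0.0
    spikes = []
    for t, current in enumerate(inputs):
        if max_spikes is not None and len(spikes) >= max_spikes:
            break  # the neuron stays silent once its spike budget is used
        v = decay * v + current  # leaky integration of the input current
        if v >= threshold:
            spikes.append(t)
            v = 0.0  # reset the membrane potential after a spike
    return spikes


drive = [0.3] * 20
print(lif_spike_times(drive, max_spikes=1))  # TTFS: at most one spike
print(lif_spike_times(drive))                # relaxed: several spikes
```

With a constant drive, the constrained neuron conveys only its first spike time, while the relaxed neuron keeps firing periodically. That reflects the sparsity-versus-information trade-off the abstract refers to, in a much simpler setting.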
Transfer Learning With Singular Value Decomposition of Multichannel Convolution Matrices
IF 2.9 | Zone 4, Computer Science | Q1 Arts and Humanities | Pub Date: 2023-09-08 | DOI: 10.1162/neco_a_01608
Tak Shing Au Yeung;Ka Chun Cheung;Michael K. Ng;Simon See;Andy Yip
The task of transfer learning using pretrained convolutional neural networks is considered. We propose a convolution-SVD layer to analyze the convolution operators with a singular value decomposition computed in the Fourier domain. Singular vectors extracted from the source domain are transferred to the target domain, whereas the singular values are fine-tuned with a target data set. In this way, dimension reduction is achieved to avoid overfitting, while some flexibility to fine-tune the convolution kernels is maintained. We extend an existing convolution kernel reconstruction algorithm to allow for a reconstruction from an arbitrary set of learned singular values. A generalization bound for a single convolution-SVD layer is devised to show the consistency between training and testing errors. We further introduce a notion of transfer learning gap. We prove that the testing error for a single convolution-SVD layer is bounded in terms of the gap, which motivates us to develop a regularization model with the gap as the regularizer. Numerical experiments are conducted to demonstrate the superiority of the proposed model in solving classification problems and the influence of various parameters. In particular, the regularization is shown to yield a significantly higher prediction accuracy.
Citations: 0
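The idea of computing the singular values of a convolution operator in the Fourier domain, mentioned in the entry above, can be checked in its simplest form. The sketch below handles only a single-channel, one-dimensional circular convolution, where the singular values are the magnitudes of the kernel's discrete Fourier transform. The multichannel case treated in the paper additionally needs an SVD of a small channel matrix at each frequency, which is omitted here. All function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)), adequate for short kernels."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def circular_conv_singular_values(kernel, n):
    """Singular values of the n x n circular-convolution matrix of `kernel`."""
    padded = list(kernel) + [0.0] * (n - len(kernel))
    return sorted((abs(c) for c in dft(padded)), reverse=True)


def circulant(kernel, n):
    """Explicit n x n circulant matrix, built only to cross-check the result."""
    padded = list(kernel) + [0.0] * (n - len(kernel))
    return [[padded[(i - j) % n] for j in range(n)] for i in range(n)]


kernel = [1.0, -2.0, 0.5]
n = 8
sigma = circular_conv_singular_values(kernel, n)
frob_sq = sum(v * v for row in circulant(kernel, n) for v in row)
# The squared singular values must sum to the squared Frobenius norm.
print(sigma[0], sum(s * s for s in sigma), frob_sq)
```

The Frobenius-norm comparison is a cheap consistency check that avoids a full dense SVD. Fine-tuning only the singular values, as the abstract describes, would amount to rescaling these Fourier-domain magnitudes while keeping the singular vectors fixed.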
A Noise-Based Novel Strategy for Faster SNN Training
IF 2.9 | Zone 4, Computer Science | Q1 Arts and Humanities | Pub Date: 2023-08-07 | DOI: 10.1162/neco_a_01604
Chunming Jiang;Yilei Zhang
Spiking neural networks (SNNs) are receiving increasing attention due to their low power consumption and strong bioplausibility. Optimization of SNNs is a challenging task. The two main methods, artificial neural network (ANN)-to-SNN conversion and spike-based backpropagation (BP), both have advantages and limitations. ANN-to-SNN conversion requires a long inference time to approximate the accuracy of the ANN, thus diminishing the benefits of the SNN. With spike-based BP, training high-precision SNNs typically consumes dozens of times more computational resources and time than training their ANN counterparts. In this letter, we propose a novel SNN training approach that combines the benefits of the two methods. We first train a single-step SNN (T = 1) by approximating the neural potential distribution with random noise, then convert the single-step SNN (T = 1) to a multistep SNN (T = N) losslessly. The introduction of Gaussian-distributed noise leads to a significant gain in accuracy after conversion. The results show that our method considerably reduces the training and inference times of SNNs while maintaining their high accuracy. Compared to the previous two methods, ours can reduce training time by 65% to 75% and achieves more than 100 times faster inference speed. We also argue that augmenting the neuron model with noise makes it more bioplausible.
Citations: 0
Composite Optimization Algorithms for Sigmoid Networks
IF 2.9 | Zone 4, Computer Science | Q1 Arts and Humanities | Pub Date: 2023-08-07 | DOI: 10.1162/neco_a_01603
Huixiong Chen;Qi Ye
In this letter, we use composite optimization algorithms to solve sigmoid networks. We equivalently transform the sigmoid networks into a convex composite optimization problem and propose composite optimization algorithms based on the linearized proximal algorithms and the alternating direction method of multipliers. Under the assumptions of weak sharp minima and the regularity condition, the algorithm is guaranteed to converge to a globally optimal solution of the objective function even in the case of nonconvex and nonsmooth problems. Furthermore, the convergence results can be directly related to the amount of training data and provide a general guide for setting the size of sigmoid networks. Numerical experiments on Franke’s function fitting and handwritten digit recognition show that the proposed algorithms perform satisfactorily and robustly.
Citations: 0
Mirror Descent of Hopfield Model
IF 2.9 | Zone 4, Computer Science | Q1 Arts and Humanities | Pub Date: 2023-08-07 | DOI: 10.1162/neco_a_01602
Hyungjoon Soh;Dongyeob Kim;Juno Hwang;Junghyo Jo
Mirror descent is an elegant optimization technique that leverages a dual space of parametric models to perform gradient descent. While originally developed for convex optimization, it has increasingly been applied in the field of machine learning. In this study, we propose a novel approach for using mirror descent to initialize the parameters of neural networks. Specifically, we demonstrate that by using the Hopfield model as a prototype for neural networks, mirror descent can effectively train the model with significantly improved performance compared to traditional gradient descent methods that rely on random parameter initialization. Our findings highlight the potential of mirror descent as a promising initialization technique for enhancing the optimization of machine learning models.
Citations: 0
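For readers unfamiliar with the technique named in the mirror descent entry above, here is a minimal sketch of its best-known instance: entropic mirror descent (exponentiated gradient) on the probability simplex. It does not reproduce the paper's application to the Hopfield model. The function name `mirror_descent_simplex`, the step size, and the linear toy objective are assumptions made for illustration.

```python
import math


def mirror_descent_simplex(grad, x0, step=0.1, iters=200):
    """Entropic mirror descent (exponentiated gradient) on the probability simplex."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        # Take the gradient step in the dual (log) coordinates, then map back
        # to the simplex by exponentiating and normalising.
        w = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
        total = sum(w)
        x = [wi / total for wi in w]
    return x


# Hypothetical objective f(x) = sum_i c_i * x_i, whose minimum over the
# simplex sits at the vertex with the smallest cost.
costs = [3.0, 1.0, 2.0]
x_star = mirror_descent_simplex(lambda x: costs, [1 / 3, 1 / 3, 1 / 3])
print([round(v, 4) for v in x_star])
```

Because the update is multiplicative, the iterate stays on the simplex without an explicit projection. Avoiding that projection is the practical appeal of choosing a mirror map that matches the geometry of the parameters.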
Mean-Field Approximations With Adaptive Coupling for Networks With Spike-Timing-Dependent Plasticity
IF 2.9 | Zone 4, Computer Science | Q1 Arts and Humanities | Pub Date: 2023-08-07 | DOI: 10.1162/neco_a_01601
Benoit Duchet;Christian Bick;Áine Byrne
Understanding the effect of spike-timing-dependent plasticity (STDP) is key to elucidating how neural networks change over long timescales and to design interventions aimed at modulating such networks in neurological disorders. However, progress is restricted by the significant computational cost associated with simulating neural network models with STDP and by the lack of low-dimensional description that could provide analytical insights. Phase-difference-dependent plasticity (PDDP) rules approximate STDP in phase oscillator networks, which prescribe synaptic changes based on phase differences of neuron pairs rather than differences in spike timing. Here we construct mean-field approximations for phase oscillator networks with STDP to describe part of the phase space for this very high-dimensional system. We first show that single-harmonic PDDP rules can approximate a simple form of symmetric STDP, while multiharmonic rules are required to accurately approximate causal STDP. We then derive exact expressions for the evolution of the average PDDP coupling weight in terms of network synchrony. For adaptive networks of Kuramoto oscillators that form clusters, we formulate a family of low-dimensional descriptions based on the mean-field dynamics of each cluster and average coupling weights between and within clusters. Finally, we show that such a two-cluster mean-field model can be fitted to synthetic data to provide a low-dimensional approximation of a full adaptive network with symmetric STDP. Our framework represents a step toward a low-dimensional description of adaptive networks with STDP, and could for example inform the development of new therapies aimed at maximizing the long-lasting effects of brain stimulation.
Citations: 4
On an Interpretation of ResNets via Gate-Network Control
IF 2.9 Zone 4, Computer Science Q1 Arts and Humanities Pub Date: 2023-08-07 DOI: 10.1162/neco_a_01600
Changcun Huang
This letter first constructs a typical solution of ResNets for multicategory classifications based on the idea of the gate control of LSTMs, from which a general interpretation of the ResNet architecture is given and the performance mechanism is explained. We also use more solutions to further demonstrate the generality of that interpretation. The classification result is then extended to the universal-approximation capability of the type of ResNet with two-layer gate networks, an architecture that was proposed in an original paper of ResNets and has both theoretical and practical significance.
Citations: 0