
arXiv - CS - Neural and Evolutionary Computing: Latest Publications

Evolutionary Algorithms Are Significantly More Robust to Noise When They Ignore It
Pub Date : 2024-08-31 DOI: arxiv-2409.00306
Denis Antipov, Benjamin Doerr
Randomized search heuristics (RSHs) are generally believed to be robust to noise. However, almost all mathematical analyses of how RSHs cope with noisy access to the objective function assume that each solution is re-evaluated whenever it is compared to others. This is unfortunate, both because it wastes computational resources and because it requires the user to foresee that noise is present (in a noise-free setting, one would never re-evaluate solutions). In this work, we show that the need for re-evaluations could be overestimated and, in fact, detrimental. For the classic benchmark problem of how the $(1+1)$ evolutionary algorithm optimizes the LeadingOnes benchmark, we show that without re-evaluations, noise rates up to a constant can be tolerated, much more than the $O(n^{-2} \log n)$ noise rates that can be tolerated when re-evaluating solutions. This first runtime analysis of an evolutionary algorithm solving a single-objective noisy problem without re-evaluations could indicate that such algorithms cope with noise much better than previously thought, and without the need to foresee the presence of noise.
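To make the setting concrete, below is a minimal sketch of a $(1+1)$ evolutionary algorithm on a noisy LeadingOnes instance that evaluates each solution only once and never re-evaluates the stored parent. The noise model (one-bit prior noise with rate `p`) and the loop bound are assumptions chosen for illustration, not the paper's exact analysis setup.

```python
import random

def leading_ones(x):
    """Number of leading 1-bits in the bit string x."""
    count = 0
    for bit in x:
        if bit == 1:
            count += 1
        else:
            break
    return count

def noisy_leading_ones(x, p):
    """Prior one-bit noise (assumed noise model): with probability p, a uniformly
    random bit is flipped before LeadingOnes is evaluated."""
    if random.random() < p:
        y = list(x)
        i = random.randrange(len(y))
        y[i] ^= 1
        return leading_ones(y)
    return leading_ones(x)

def one_plus_one_ea_without_reevaluation(n=100, p=0.1, max_iters=500_000):
    """(1+1) EA with standard bit mutation; every solution is evaluated exactly once,
    and the stored parent fitness is never re-evaluated."""
    parent = [random.randint(0, 1) for _ in range(n)]
    parent_fitness = noisy_leading_ones(parent, p)           # evaluated once, kept as-is
    for _ in range(max_iters):
        offspring = [b ^ 1 if random.random() < 1.0 / n else b for b in parent]
        offspring_fitness = noisy_leading_ones(offspring, p)
        if offspring_fitness >= parent_fitness:               # accept ties; no re-evaluation
            parent, parent_fitness = offspring, offspring_fitness
        if leading_ones(parent) == n:                         # check against the noise-free optimum
            return parent
    return parent

best = one_plus_one_ea_without_reevaluation()
```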
Citations: 0
Continual learning with the neural tangent ensemble
Pub Date : 2024-08-30 DOI: arxiv-2408.17394
Ari S. Benjamin, Christian Pehle, Kyle Daruwalla
A natural strategy for continual learning is to weigh a Bayesian ensemble of fixed functions. This suggests that if a (single) neural network could be interpreted as an ensemble, one could design effective algorithms that learn without forgetting. To realize this possibility, we observe that a neural network classifier with N parameters can be interpreted as a weighted ensemble of N classifiers, and that in the lazy regime limit these classifiers are fixed throughout learning. We term these classifiers the neural tangent experts and show they output valid probability distributions over the labels. We then derive the likelihood and posterior probability of each expert given past data. Surprisingly, we learn that the posterior updates for these experts are equivalent to a scaled and projected form of stochastic gradient descent (SGD) over the network weights. Away from the lazy regime, networks can be seen as ensembles of adaptive experts which improve over time. These results offer a new interpretation of neural networks as Bayesian ensembles of experts, providing a principled framework for understanding and mitigating catastrophic forgetting in continual learning settings.
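One standard way to see the ensemble interpretation mentioned above is the first-order (lazy-regime) expansion of the network output around its initialization; this is a textbook linearization given here for orientation, not the paper's full derivation.

```latex
% First-order (lazy-regime) expansion of the network output f(x;\theta) around the
% initialization \theta_0: each of the N parameters contributes a fixed function,
% which can be read as the "expert" associated with that parameter.
\[
  f(x;\theta) \;\approx\; f(x;\theta_0)
  + \sum_{i=1}^{N} \left(\theta_i - \theta_{0,i}\right)
    \frac{\partial f(x;\theta_0)}{\partial \theta_i}
\]
```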
Citations: 0
Stepwise Weighted Spike Coding for Deep Spiking Neural Networks
Pub Date : 2024-08-30 DOI: arxiv-2408.17245
Yiwen Gu, Junchuan Gu, Haibin Shen, Kejie Huang
Spiking Neural Networks (SNNs) seek to mimic the spiking behavior of biological neurons and are expected to play a key role in the advancement of neural computing and artificial intelligence. The efficiency of SNNs is often determined by the neural coding schemes. Existing coding schemes either cause huge delays and energy consumption or necessitate intricate neuron models and training techniques. To address these issues, we propose a novel Stepwise Weighted Spike (SWS) coding scheme to enhance the encoding of information in spikes. This approach compresses the spikes by weighting the significance of the spike in each step of neural computation, achieving high performance and low energy consumption. A Ternary Self-Amplifying (TSA) neuron model with a silent period is proposed for supporting SWS-based computing, aimed at minimizing the residual error resulting from stepwise weighting in neural computation. Our experimental results show that the SWS coding scheme outperforms the existing neural coding schemes in very deep SNNs, and significantly reduces operations and latency.
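As a rough illustration of the idea of weighting the significance of a spike in each step, the sketch below greedily encodes an activation value with per-step weights; the weights and the greedy rule are assumptions for illustration and do not reproduce the actual SWS scheme or the TSA neuron.

```python
import numpy as np

def weighted_spike_encode(value, weights):
    """Greedily encode a non-negative activation as one binary spike per step,
    where step t carries weight weights[t]; returns the spikes and the decoded value.
    Illustrative only -- the paper's SWS scheme and TSA neuron are more involved."""
    spikes = np.zeros(len(weights), dtype=np.int8)
    residual = float(value)
    for t, w in enumerate(weights):
        if residual >= w:           # fire if the remaining value still covers this step's weight
            spikes[t] = 1
            residual -= w
    decoded = float(np.dot(spikes, weights))
    return spikes, decoded

# Example: 4 steps with exponentially decreasing weights (an assumed weighting).
weights = np.array([8.0, 4.0, 2.0, 1.0])
spikes, decoded = weighted_spike_encode(13.0, weights)
print(spikes, decoded)   # [1 1 0 1] 13.0
```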
Citations: 0
Efficient Estimation of Unique Components in Independent Component Analysis by Matrix Representation
Pub Date : 2024-08-30 DOI: arxiv-2408.17118
Yoshitatsu Matsuda, Kazunori Yamaguch
Independent component analysis (ICA) is a widely used method in various applications of signal processing and feature extraction. It extends principal component analysis (PCA) and can extract important and complicated components with small variances. One of the major problems of ICA is that the uniqueness of the solution is not guaranteed, unlike PCA. That is because there are many local optima in optimizing the objective function of ICA. It has been shown previously that the unique global optimum of ICA can be estimated from many random initializations by handcrafted thread computation. In this paper, the unique estimation of ICA is highly accelerated by reformulating the algorithm in matrix representation and reducing redundant calculations. Experimental results on artificial datasets and EEG data verified the efficiency of the proposed method.
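The sketch below illustrates only the baseline idea the paper accelerates: running ICA from several random initializations and keeping the run with the best non-Gaussianity contrast. It uses scikit-learn's FastICA and a log-cosh surrogate objective as stand-ins; the paper's matrix-representation reformulation is not reproduced here.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Toy mixed signals: two independent sources mixed linearly (assumed setup for illustration).
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sign(np.sin(3 * t)), rng.laplace(size=t.size)]
X = S @ np.array([[1.0, 0.5], [0.4, 1.0]]).T

def negentropy_score(Y):
    """Rough non-Gaussianity score of the extracted components using the log-cosh
    contrast (an illustrative surrogate for the ICA objective, not the paper's one)."""
    Yw = (Y - Y.mean(0)) / Y.std(0)
    g = np.log(np.cosh(Yw)).mean(0)
    g_gauss = np.log(np.cosh(rng.standard_normal(10_000))).mean()
    return float(np.sum((g - g_gauss) ** 2))

# Run ICA from several random initializations and keep the run with the best contrast,
# mimicking the "many random initializations" strategy that the paper speeds up.
best_sources = max(
    (FastICA(n_components=2, random_state=seed).fit_transform(X) for seed in range(10)),
    key=negentropy_score,
)
```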
Citations: 0
ART: Actually Robust Training
Pub Date : 2024-08-29 DOI: arxiv-2408.16285
Sebastian Chwilczyński, Kacper Trębacz, Karol Cyganik, Mateusz Małecki, Dariusz Brzezinski
Current interest in deep learning captures the attention of many programmers and researchers. Unfortunately, the lack of a unified schema for developing deep learning models results in methodological inconsistencies, unclear documentation, and problems with reproducibility. Some guidelines have been proposed, yet currently, they lack practical implementations. Furthermore, neural network training often takes on the form of trial and error, lacking a structured and thoughtful process. To alleviate these issues, in this paper, we introduce Art, a Python library designed to help automatically impose rules and standards while developing deep learning pipelines. Art divides model development into a series of smaller steps of increasing complexity, each concluded with a validation check improving the interpretability and robustness of the process. The current version of Art comes equipped with nine predefined steps inspired by Andrej Karpathy's Recipe for Training Neural Networks, a visualization dashboard, and integration with loggers such as Neptune. The code related to this paper is available at: https://github.com/SebChw/Actually-Robust-Training.
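The abstract does not show Art's interface, so the following is a generic, hypothetical sketch of the underlying pattern only: development split into small steps, each gated by a validation check. All names and thresholds are invented for illustration and are not Art's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Step:
    """One hypothetical development step: run it, then verify it before moving on."""
    name: str
    run: Callable[[], Dict[str, float]]        # produces metrics (e.g., loss on a tiny batch)
    check: Callable[[Dict[str, float]], bool]  # validation gate over those metrics

def run_pipeline(steps: List[Step]) -> None:
    for step in steps:
        metrics = step.run()
        if not step.check(metrics):
            raise RuntimeError(f"Step '{step.name}' failed its validation check: {metrics}")
        print(f"Step '{step.name}' passed: {metrics}")

# Example gates in the spirit of Karpathy's recipe (illustrative names and values only):
steps = [
    Step("overfit_one_batch", run=lambda: {"loss": 0.01}, check=lambda m: m["loss"] < 0.05),
    Step("verify_init_loss",  run=lambda: {"loss": 2.31}, check=lambda m: abs(m["loss"] - 2.30) < 0.1),
]
run_pipeline(steps)
```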
Citations: 0
Maelstrom Networks
Pub Date : 2024-08-29 DOI: arxiv-2408.16632
Matthew Evanusa, Cornelia Fermüller, Yiannis Aloimonos
Artificial neural networks have struggled to devise a way to incorporate working memory into neural networks. While the "long term" memory can be seen as the learned weights, working memory likely consists more of dynamical activity, which is missing from feed-forward models. Current state-of-the-art models such as transformers tend to "solve" this by ignoring working memory entirely and simply processing the sequence as an entire piece of data; however, this means the network cannot process the sequence in an online fashion, and it leads to an immense explosion in memory requirements. Here, inspired by a combination of controls, reservoir computing, deep learning, and recurrent neural networks, we offer an alternative paradigm that combines the strength of recurrent networks with the pattern matching capability of feed-forward neural networks, which we call the Maelstrom Networks paradigm. This paradigm leaves the recurrent component - the Maelstrom - unlearned, and offloads the learning to a powerful feed-forward network. This allows the network to leverage the strength of feed-forward training without unrolling the network, and allows for the memory to be implemented in new neuromorphic hardware. It endows a neural network with a sequential memory that takes advantage of the inductive bias that data is organized causally in the temporal domain, and imbues the network with a state that represents the agent's "self", moving through the environment. This could also lead the way to continual learning, with the network modularized and "protected" from overwrites that come with new data. In addition to aiding in solving these performance problems that plague current non-temporal deep networks, this could also finally lead towards endowing artificial networks with a sense of "self".
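The general pattern of an unlearned recurrent core feeding a trainable feed-forward readout is familiar from reservoir computing; the sketch below shows that pattern under assumed dimensions and scaling, and is not the Maelstrom architecture itself.

```python
import torch
import torch.nn as nn

class FrozenReservoirWithReadout(nn.Module):
    """Sketch in the spirit of reservoir computing: a fixed (unlearned) recurrent state
    holds working memory, and only a feed-forward readout is trained.
    This illustrates the general idea, not the Maelstrom architecture."""
    def __init__(self, input_dim, reservoir_dim, num_classes):
        super().__init__()
        self.w_in = nn.Parameter(torch.randn(reservoir_dim, input_dim) * 0.1,
                                 requires_grad=False)
        w_rec = torch.randn(reservoir_dim, reservoir_dim)
        w_rec *= 0.9 / torch.linalg.eigvals(w_rec).abs().max()   # keep the dynamics stable
        self.w_rec = nn.Parameter(w_rec, requires_grad=False)     # recurrent part stays unlearned
        self.readout = nn.Sequential(                              # learning happens only here
            nn.Linear(reservoir_dim, 128), nn.ReLU(), nn.Linear(128, num_classes)
        )

    def forward(self, x_seq):
        # x_seq: (time, batch, input_dim), processed online one step at a time
        h = torch.zeros(x_seq.shape[1], self.w_rec.shape[0])
        for x_t in x_seq:
            h = torch.tanh(h @ self.w_rec.T + x_t @ self.w_in.T)
        return self.readout(h)

model = FrozenReservoirWithReadout(input_dim=32, reservoir_dim=256, num_classes=10)
logits = model(torch.randn(50, 8, 32))   # 50 time steps, batch of 8
```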
Citations: 0
Reconsidering the energy efficiency of spiking neural networks
Pub Date : 2024-08-29 DOI: arxiv-2409.08290
Zhanglu Yan, Zhenyu Bai, Weng-Fai Wong
Spiking neural networks (SNNs) are generally regarded as more energy-efficient because they do not use multiplications. However, most SNN works only consider the counting of additions to evaluate energy consumption, neglecting other overheads such as memory accesses and data movement operations. This oversight can lead to a misleading perception of efficiency, especially when state-of-the-art SNN accelerators operate with very small time window sizes. In this paper, we present a detailed comparison of the energy consumption of artificial neural networks (ANNs) and SNNs from a hardware perspective. We provide accurate formulas for energy consumption based on classical multi-level memory hierarchy architectures, commonly used neuromorphic dataflow architectures, and our proposed improved spatial-dataflow architecture. Our research demonstrates that to achieve comparable accuracy and greater energy efficiency than ANNs, SNNs require strict limitations on both time window size T and sparsity s. For instance, with the VGG16 model and a fixed T of 6, the neuron sparsity rate must exceed 93% to ensure energy efficiency across most architectures. Inspired by our findings, we explore strategies to enhance energy efficiency by increasing sparsity. We introduce two regularization terms during training that constrain weights and activations, effectively boosting the sparsity rate. Our experiments on the CIFAR-10 dataset, using T of 6, show that our SNNs consume 69% of the energy used by optimized ANNs on spatial-dataflow architectures, while maintaining an SNN accuracy of 94.18%. This framework, developed using PyTorch, is publicly available for use and further research.
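A toy first-order energy accounting, shown below, illustrates how the time window T and the sparsity rate enter such a comparison; the per-operation energy constants are rough figures assumed for illustration, and the model deliberately omits the memory-hierarchy and dataflow details that the paper's formulas capture.

```python
# Illustrative first-order energy model contrasting a fully connected ANN layer with an
# SNN layer over T time steps. The per-operation energies are assumed round numbers,
# not the paper's measured or derived values.
E_MAC = 4.6e-12      # J per multiply-accumulate (assumed)
E_ADD = 0.9e-12      # J per accumulate (assumed)
E_MEM = 5.0e-12      # J per on-chip weight access (assumed)

def ann_layer_energy(fan_in, fan_out):
    macs = fan_in * fan_out
    return macs * (E_MAC + E_MEM)

def snn_layer_energy(fan_in, fan_out, T, sparsity):
    # Only spikes trigger work: each input spike causes fan_out accumulates + weight reads.
    spikes = fan_in * T * (1.0 - sparsity)
    return spikes * fan_out * (E_ADD + E_MEM)

fan_in, fan_out, T = 4096, 4096, 6
for s in (0.80, 0.93, 0.99):
    ratio = snn_layer_energy(fan_in, fan_out, T, s) / ann_layer_energy(fan_in, fan_out)
    print(f"sparsity {s:.0%}: SNN/ANN energy ratio = {ratio:.2f}")
```

Even this crude model shows the SNN advantage shrinking as T grows and sparsity falls; the paper's point is that once all memory and data-movement overheads are counted, the required sparsity becomes much stricter than addition-only counting suggests.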
Citations: 0
Spiking Diffusion Models
Pub Date : 2024-08-29 DOI: arxiv-2408.16467
Jiahang Cao, Hanzhong Guo, Ziqing Wang, Deming Zhou, Hao Cheng, Qiang Zhang, Renjing Xu
Recent years have witnessed Spiking Neural Networks (SNNs) gaining attention for their ultra-low energy consumption and high biological plausibility compared with traditional Artificial Neural Networks (ANNs). Despite their distinguished properties, the application of SNNs in the computationally intensive field of image generation is still under exploration. In this paper, we propose the Spiking Diffusion Models (SDMs), an innovative family of SNN-based generative models that excel in producing high-quality samples with significantly reduced energy consumption. In particular, we propose a Temporal-wise Spiking Mechanism (TSM) that allows SNNs to capture more temporal features from a bio-plasticity perspective. In addition, we propose a threshold-guided strategy that can further improve the performances by up to 16.7% without any additional training. We also make the first attempt to use the ANN-SNN approach for SNN-based generation tasks. Extensive experimental results reveal that our approach not only exhibits comparable performance to its ANN counterpart with few spiking time steps, but also outperforms previous SNN-based generative models by a large margin. Moreover, we also demonstrate the high-quality generation ability of SDM on large-scale datasets, e.g., LSUN bedroom. This development marks a pivotal advancement in the capabilities of SNN-based generation, paving the way for future research avenues to realize low-energy and low-latency generative applications. Our code is available at https://github.com/AndyCao1125/SDM.
Citations: 0
Addressing Common Misinterpretations of KART and UAT in Neural Network Literature
Pub Date : 2024-08-29 DOI: arxiv-2408.16389
Vugar Ismailov
This note addresses the Kolmogorov-Arnold Representation Theorem (KART) and the Universal Approximation Theorem (UAT), focusing on their common misinterpretations in some papers related to neural network approximation. Our remarks aim to support a more accurate understanding of KART and UAT among neural network specialists.
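For orientation, the representation at issue is usually stated as follows (the standard Kolmogorov-Arnold form; the note's specific remarks are not reproduced here): every continuous function on the unit cube admits the superposition below with continuous univariate inner and outer functions.

```latex
% Kolmogorov-Arnold representation: for every continuous f on [0,1]^n there exist
% continuous univariate outer functions \Phi_q and inner functions \phi_{q,p} with
\[
  f(x_1,\dots,x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right).
\]
```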
Citations: 0
A Novel Denoising Technique and Deep Learning Based Hybrid Wind Speed Forecasting Model for Variable Terrain Conditions
Pub Date : 2024-08-28 DOI: arxiv-2408.15554
Sourav Malakar, Saptarsi Goswami, Amlan Chakrabarti, Bhaswati Ganguli
Wind flow can be highly unpredictable and can suffer substantial fluctuations in speed and direction due to the shape and height of hills, mountains, and valleys, making accurate wind speed (WS) forecasting essential in complex terrain. This paper presents a novel and adaptive model for short-term forecasting of WS. The paper's key contributions are as follows: (a) The Partial Auto Correlation Function (PACF) is utilised to minimise the dimension of the set of Intrinsic Mode Functions (IMFs), hence reducing training time; (b) The sample entropy (SampEn) was used to calculate the complexity of the reduced set of IMFs. The proposed technique is adaptive since a specific Deep Learning (DL) model-feature combination was chosen based on complexity; (c) A novel bidirectional feature-LSTM framework for complicated IMFs has been suggested, resulting in improved forecasting accuracy; (d) The proposed model shows superior forecasting performance compared to the persistence, hybrid, Ensemble Empirical Mode Decomposition (EEMD), and Variational Mode Decomposition (VMD)-based deep learning models. It has achieved the lowest variance in forecasting accuracy between simple and complex terrain conditions (0.70%). Dimension reduction of the IMFs and complexity-based model-feature selection help reduce the training time by 68.77% and improve forecasting quality by 58.58% on average.
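A minimal sketch of the two pre-processing ideas named in (a) and (b) is given below, assuming the IMFs have already been produced by an upstream decomposition; the PACF threshold, SampEn parameters, and synthetic signals are assumptions for illustration, and the bidirectional feature-LSTM stage is not included.

```python
import numpy as np
from statsmodels.tsa.stattools import pacf

def sample_entropy(x, m=2, r_factor=0.2):
    """Plain SampEn(m, r) of a 1-D series (straightforward O(n^2) implementation)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()
    def count_matches(mm):
        templates = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        d = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        return (np.sum(d <= r) - len(templates)) / 2          # unordered pairs, no self-matches
    b, a = count_matches(m), count_matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.inf

def select_imfs_by_pacf(imfs, nlags=24, threshold=0.2):
    """Keep only IMFs whose partial autocorrelation shows usable structure
    (the threshold is an assumed value for illustration)."""
    kept = []
    for imf in imfs:
        p = pacf(imf, nlags=nlags)
        if np.max(np.abs(p[1:])) > threshold:
            kept.append(imf)
    return kept

# Example with synthetic "IMFs": a slow oscillation, a fast one, and white noise.
t = np.linspace(0, 20, 600)
imfs = [np.sin(0.5 * t), np.sin(5.0 * t), np.random.default_rng(1).normal(size=t.size)]
kept = select_imfs_by_pacf(imfs)
complexities = [sample_entropy(imf) for imf in kept]   # used to pick a DL model per IMF
```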
Citations: 0