
arXiv - CS - Neural and Evolutionary Computing: Latest Papers

Runtime analysis of a coevolutionary algorithm on impartial combinatorial games
Pub Date : 2024-09-06 DOI: arxiv-2409.04177
Alistair Benford, Per Kristian Lehre
Due to their complex dynamics, combinatorial games are a key test case and application for algorithms that train game playing agents. Among those algorithms that train using self-play are coevolutionary algorithms (CoEAs). CoEAs evolve a population of individuals by iteratively selecting the strongest based on their interactions against contemporaries, and using those selected as parents for the following generation (via randomised mutation and crossover). However, the successful application of CoEAs for game playing is difficult due to pathological behaviours such as cycling, an issue especially critical for games with intransitive payoff landscapes. Insight into how to design CoEAs to avoid such behaviours can be provided by runtime analysis. In this paper, we push the scope of runtime analysis to combinatorial games, proving a general upper bound for the number of simulated games needed for UMDA (a type of CoEA) to discover (with high probability) an optimal strategy for an impartial combinatorial game. This result applies to any impartial combinatorial game, and for many games the implied bound is polynomial or quasipolynomial as a function of the number of game positions. After proving the main result, we provide several applications to simple well-known games: Nim, Chomp, Silver Dollar, and Turning Turtles. As the first runtime analysis for CoEAs on combinatorial games, this result is a critical step towards a comprehensive theoretical framework for coevolution.
Citations: 0
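To make "an optimal strategy for an impartial combinatorial game" concrete for Nim, one of the games named in the abstract above, here is a minimal Python sketch of the classical nim-sum rule under normal play (the player who takes the last object wins). It is not the UMDA-based coevolutionary algorithm analysed in the paper, only a reference for what such an algorithm would have to discover; the function names `nim_sum` and `winning_move` are illustrative.

```python
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """XOR of all heap sizes; zero means the player to move loses under optimal play."""
    return reduce(xor, heaps, 0)


def winning_move(heaps):
    """Return (heap_index, new_size) for a move that leaves nim-sum zero.

    Returns None when the nim-sum is already zero, i.e. no winning move exists.
    """
    total = nim_sum(heaps)
    if total == 0:
        return None  # every legal move hands the opponent a winning position
    for index, size in enumerate(heaps):
        target = size ^ total
        if target < size:  # a legal move must strictly shrink the heap
            return index, target
    return None  # not reached when the nim-sum is non-zero
```

For heaps of sizes 3, 4 and 5 the nim-sum is non-zero, so the player to move can win; applying the returned move leaves the opponent a position with nim-sum zero.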
Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models
Pub Date : 2024-09-06 DOI: arxiv-2409.04270
Yuxiao Huang, Xuebin Lv, Shenghao Wu, Jibin Wu, Liang Feng, Kay Chen Tan
Evolutionary Multi-task Optimization (EMTO) is a paradigm that leverages knowledge transfer across simultaneously optimized tasks for enhanced search performance. To facilitate EMTO's performance, various knowledge transfer models have been developed for specific optimization tasks. However, designing these models often requires substantial expert knowledge. Recently, large language models (LLMs) have achieved remarkable success in autonomous programming, aiming to produce effective solvers for specific problems. In this work, an LLM-based optimization paradigm is introduced to establish an autonomous model factory for generating knowledge transfer models, ensuring effective and efficient knowledge transfer across various optimization tasks. To evaluate the performance of the proposed method, we conducted comprehensive empirical studies comparing the knowledge transfer model generated by the LLM with existing state-of-the-art knowledge transfer methods. The results demonstrate that the generated model is able to achieve superior or competitive performance against hand-crafted knowledge transfer models in terms of both efficiency and effectiveness.
Citations: 0
Pareto Set Prediction Assisted Bilevel Multi-objective Optimization
Pub Date : 2024-09-05 DOI: arxiv-2409.03328
Bing Wang, Hemant K. Singh, Tapabrata Ray
Bilevel optimization problems comprise an upper level optimization task that contains a lower level optimization task as a constraint. While there is a significant and growing literature devoted to solving bilevel problems with a single objective at both levels using evolutionary computation, there is relatively scarce work done to address problems with multiple objectives (BLMOP) at both levels. For black-box BLMOPs, the existing evolutionary techniques typically utilize nested search, which in its native form consumes a large number of function evaluations. In this work, we propose to reduce this expense by predicting the lower level Pareto set for a candidate upper level solution directly, instead of conducting an optimization from scratch. Such a prediction is significantly challenging for BLMOPs as it involves a one-to-many mapping scenario. We resolve this bottleneck by supplementing the dataset using a helper variable and construct a neural network, which can then be trained to map the variables in a meaningful manner. Then, we embed this initialization within a bilevel optimization framework, termed Pareto set prediction assisted evolutionary bilevel multi-objective optimization (PSP-BLEMO). Systematic experiments with existing state-of-the-art methods are presented to demonstrate its benefit. The experiments show that the proposed approach is competitive across a range of problems, including both deceptive and non-deceptive problems.
Citations: 0
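The "lower level Pareto set" in the abstract above is the set of lower-level solutions that no other solution dominates for a fixed upper-level candidate. As a point of reference, the following is a minimal brute-force sketch of non-dominated filtering under minimisation. It is not the neural predictor of PSP-BLEMO; the names `dominates` and `non_dominated` are illustrative, and the filter works on objective vectors (the decision vectors that produce the surviving objective vectors form the Pareto set).

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Return the objective vectors not dominated by any other, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

This filter compares every pair of points, so its cost grows quadratically with the number of candidates. A nested search pays for a full lower-level optimization at every upper-level candidate, which is the expense the abstract proposes to avoid by predicting the set directly.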
Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small
Pub Date : 2024-09-05 DOI: arxiv-2409.04478
Maheep Chaudhary, Atticus Geiger
A popular new method in mechanistic interpretability is to train high-dimensional sparse autoencoders (SAEs) on neuron activations and use SAE features as the atomic units of analysis. However, the body of evidence on whether SAE feature spaces are useful for causal analysis is underdeveloped. In this work, we use the RAVEL benchmark to evaluate whether SAEs trained on hidden representations of GPT-2 small have sets of features that separately mediate knowledge of which country a city is in and which continent it is in. We evaluate four open-source SAEs for GPT-2 small against each other, with neurons serving as a baseline, and linear features learned via distributed alignment search (DAS) serving as a skyline. For each, we learn a binary mask to select features that will be patched to change the country of a city without changing the continent, or vice versa. Our results show that SAEs struggle to reach the neuron baseline, and none come close to the DAS skyline. We release code here: https://github.com/MaheepChaudhary/SAE-Ravel
Citations: 0
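The masked patching step described in the abstract above can be illustrated with a very small sketch: features selected by a binary mask are copied from a "source" input into the "base" input, and all other features keep their base values. This is an assumption-laden toy, not the authors' released code: the mask is fixed by hand here rather than learned, feature vectors are plain Python lists, and the name `patch_features` is hypothetical.

```python
def patch_features(base, source, mask):
    """Take source features where mask is 1 and base features where mask is 0."""
    if not (len(base) == len(source) == len(mask)):
        raise ValueError("base, source and mask must have the same length")
    return [s if m else b for b, s, m in zip(base, source, mask)]
```

If the masked features carried only the country attribute, decoding the patched vector would change the model's answer about the country while leaving the continent untouched; the abstract reports that the evaluated SAEs struggle to achieve this separation.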
Training-free Conversion of Pretrained ANNs to SNNs for Low-Power and High-Performance Applications
Pub Date : 2024-09-05 DOI: arxiv-2409.03368
Tong Bu, Maohua Li, Zhaofei Yu
Spiking Neural Networks (SNNs) have emerged as a promising substitute for Artificial Neural Networks (ANNs) due to their advantages of fast inference and low power consumption. However, the lack of efficient training algorithms has hindered their widespread adoption. Existing supervised learning algorithms for SNNs require significantly more memory and time than their ANN counterparts. Even commonly used ANN-SNN conversion methods necessitate re-training of ANNs to enhance conversion efficiency, incurring additional computational costs. To address these challenges, we propose a novel training-free ANN-SNN conversion pipeline. Our approach directly converts pre-trained ANN models into high-performance SNNs without additional training. The conversion pipeline includes a local-learning-based threshold balancing algorithm, which enables efficient calculation of the optimal thresholds and fine-grained adjustment of threshold value by channel-wise scaling. We demonstrate the scalability of our framework across three typical computer vision tasks: image classification, semantic segmentation, and object detection. This showcases its applicability to both classification and regression tasks. Moreover, we have evaluated the energy consumption of the converted SNNs, demonstrating their superior low-power advantage compared to conventional ANNs. Our training-free algorithm outperforms existing methods, highlighting its practical applicability and efficiency. This approach simplifies the deployment of SNNs by leveraging open-source pre-trained ANN models and neuromorphic hardware, enabling fast, low-power inference with negligible performance reduction.
Citations: 0
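Why the firing threshold matters in ANN-SNN conversion can be seen from the standard rate-coding argument: an integrate-and-fire neuron driven by a constant input, with reset by subtraction, fires at a rate that approximates a ReLU clipped at the threshold. The sketch below illustrates only that general principle. It is not the local-learning-based threshold balancing algorithm proposed in the abstract above, and the function names and default values are assumptions chosen for the example.

```python
def if_neuron_rate(current, threshold=1.0, steps=100):
    """Rate-coded output of an integrate-and-fire neuron with constant input.

    The membrane potential integrates the input each step; on reaching the
    threshold the neuron emits one spike and the threshold is subtracted.
    Returns spike_count * threshold / steps.
    """
    potential = 0.0
    spikes = 0
    for _ in range(steps):
        potential += current
        if potential >= threshold:
            potential -= threshold  # reset by subtraction keeps residual charge
            spikes += 1
    return spikes * threshold / steps


def clipped_relu(x, threshold=1.0):
    """ANN activation that the rate-coded neuron approximates."""
    return min(max(x, 0.0), threshold)
```

A threshold set too low saturates the neuron (inputs above the threshold are clipped), while one set too high yields few spikes and a coarse approximation at small time-step counts. That trade-off is what threshold balancing methods address.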
SNNAX -- Spiking Neural Networks in JAX
Pub Date : 2024-09-04 DOI: arxiv-2409.02842
Jamie Lohoff, Jan Finkbeiner, Emre Neftci
Spiking Neural Network (SNN) simulators are essential tools to prototype biologically inspired models and neuromorphic hardware architectures and predict their performance. For such a tool, ease of use and flexibility are critical, but so is simulation speed, especially given the complexity inherent to simulating SNNs. Here, we present SNNAX, a JAX-based framework for simulating and training such models with PyTorch-like intuitiveness and JAX-like execution speed. SNNAX models are easily extended and customized to fit the desired model specifications and target neuromorphic hardware. Additionally, SNNAX offers key features for optimizing the training and deployment of SNNs such as flexible automatic differentiation and just-in-time compilation. We evaluate and compare SNNAX to other commonly used machine learning (ML) frameworks used for programming SNNs. We provide key performance metrics, best practices, documented examples for simulating SNNs in SNNAX, and implement several benchmarks used in the literature.
Citations: 0
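For readers unfamiliar with what an SNN simulator computes at each time step, the following is a minimal discrete-time leaky integrate-and-fire (LIF) neuron in plain Python. It deliberately uses no SNNAX or JAX calls, since the abstract above does not describe the framework's API; the function name `lif_run`, the decay factor `beta`, and the hard reset to zero are assumptions made for the example.

```python
def lif_run(inputs, beta=0.5, threshold=1.5):
    """Simulate one leaky integrate-and-fire neuron over a sequence of input currents.

    Each step the membrane potential decays by the factor beta, integrates the
    input, and emits a spike (1) with a reset to zero once it reaches the threshold.
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = beta * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # hard reset after a spike
        else:
            spikes.append(0)
    return spikes
```

The loop carries state from one step to the next, so the time dimension cannot be parallelised naively. This sequential dependence is one reason simulation speed is a concern for SNN frameworks.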
Brain-Inspired Online Adaptation for Remote Sensing with Spiking Neural Network
Pub Date : 2024-09-03 DOI: arxiv-2409.02146
Dexin Duan, Peilin liu, Fei Wen
On-device computing, or edge computing, is becoming increasingly important for remote sensing, particularly in applications like deep network-based perception on on-orbit satellites and unmanned aerial vehicles (UAVs). In these scenarios, two brain-like capabilities are crucial for remote sensing models: (1) high energy efficiency, allowing the model to operate on edge devices with limited computing resources, and (2) online adaptation, enabling the model to quickly adapt to environmental variations, weather changes, and sensor drift. This work addresses these needs by proposing an online adaptation framework based on spiking neural networks (SNNs) for remote sensing. Starting with a pretrained SNN model, we design an efficient, unsupervised online adaptation algorithm, which adopts an approximation of the BPTT algorithm and only involves forward-in-time computation that significantly reduces the computational complexity of SNN adaptation learning. Besides, we propose an adaptive activation scaling scheme to boost online SNN adaptation performance, particularly in low time-steps. Furthermore, for the more challenging remote sensing detection task, we propose a confidence-based instance weighting scheme, which substantially improves adaptation performance in the detection task. To our knowledge, this work is the first to address the online adaptation of SNNs. Extensive experiments on seven benchmark datasets across classification, segmentation, and detection tasks demonstrate that our proposed method significantly outperforms existing domain adaptation and domain generalization approaches under varying weather conditions. The proposed method enables energy-efficient and fast online adaptation on edge devices, and has much potential in applications such as remote perception on on-orbit satellites and UAV.
Citations: 0
CoLaNET -- A Spiking Neural Network with Columnar Layered Architecture for Classification
Pub Date : 2024-09-02 DOI: arxiv-2409.01230
Mikhail Kiselev
In the present paper, I describe a spiking neural network (SNN) architecture which can be used in a wide range of supervised learning classification tasks. It is assumed that all participating signals (the classified object description, correct class label and SNN decision) have spiking nature. The distinctive feature of this architecture is a combination of prototypical network structures corresponding to different classes and significantly distinctive instances of one class (=columns) and functionally differing populations of neurons inside columns (=layers). The other distinctive feature is a novel combination of anti-Hebbian and dopamine-modulated plasticity. The plasticity rules are local and do not use the backpropagation principle. Besides that, as in my previous studies, I was guided by the requirement that all neuron/plasticity models should be easily implemented on modern neurochips. I illustrate the high performance of my network on the MNIST benchmark.
Citations: 0
Landscape-Aware Automated Algorithm Configuration using Multi-output Mixed Regression and Classification
Pub Date : 2024-09-02 DOI: arxiv-2409.01446
Fu Xing Long, Moritz Frenzel, Peter Krause, Markus Gitterle, Thomas Bäck, Niki van Stein
In landscape-aware algorithm selection problem, the effectiveness of feature-based predictive models strongly depends on the representativeness of training data for practical applications. In this work, we investigate the potential of randomly generated functions (RGF) for the model training, which cover a much more diverse set of optimization problem classes compared to the widely-used black-box optimization benchmarking (BBOB) suite. Correspondingly, we focus on automated algorithm configuration (AAC), that is, selecting the best suited algorithm and fine-tuning its hyperparameters based on the landscape features of problem instances. Precisely, we analyze the performance of dense neural network (NN) models in handling the multi-output mixed regression and classification tasks using different training data sets, such as RGF and many-affine BBOB (MA-BBOB) functions. Based on our results on the BBOB functions in 5d and 20d, near optimal configurations can be identified using the proposed approach, which can most of the time outperform the off-the-shelf default configuration considered by practitioners with limited knowledge about AAC. Furthermore, the predicted configurations are competitive against the single best solver in many cases. Overall, configurations with better performance can be best identified by using NN models trained on a combination of RGF and MA-BBOB functions.
Citations: 0
JaxLife: An Open-Ended Agentic Simulator
Pub Date : 2024-09-01 DOI: arxiv-2409.00853
Chris Lu, Michael Beukman, Michael Matthews, Jakob Foerster
Human intelligence emerged through the process of natural selection and evolution on Earth. We investigate what it would take to re-create this process in silico. While past work has often focused on low-level processes (such as simulating physics or chemistry), we instead take a more targeted approach, aiming to evolve agents that can accumulate open-ended culture and technologies across generations. Towards this, we present JaxLife: an artificial life simulator in which embodied agents, parameterized by deep neural networks, must learn to survive in an expressive world containing programmable systems. First, we describe the environment and show that it can facilitate meaningful Turing-complete computation. We then analyze the evolved emergent agents' behavior, such as rudimentary communication protocols, agriculture, and tool use. Finally, we investigate how complexity scales with the amount of compute used. We believe JaxLife takes a step towards studying evolved behavior in more open-ended simulations. Our code is available at https://github.com/luchris429/JaxLife
Citations: 0