"Runtime analysis of a coevolutionary algorithm on impartial combinatorial games"
Alistair Benford, Per Kristian Lehre. arXiv:2409.04177 (2024-09-06)

Due to their complex dynamics, combinatorial games are a key test case and application for algorithms that train game-playing agents. Among the algorithms that train using self-play are coevolutionary algorithms (CoEAs). CoEAs evolve a population of individuals by iteratively selecting the strongest based on their interactions against contemporaries, and using those selected as parents for the following generation (via randomised mutation and crossover). However, successfully applying CoEAs to game playing is difficult due to pathological behaviours such as cycling, an issue especially critical for games with intransitive payoff landscapes. Runtime analysis can provide insight into how to design CoEAs that avoid such behaviours. In this paper, we extend the scope of runtime analysis to combinatorial games, proving a general upper bound on the number of simulated games needed for UMDA (a type of CoEA) to discover, with high probability, an optimal strategy for an impartial combinatorial game. This result applies to any impartial combinatorial game, and for many games the implied bound is polynomial or quasipolynomial as a function of the number of game positions. After proving the main result, we apply it to several simple, well-known games: Nim, Chomp, Silver Dollar, and Turning Turtles. As the first runtime analysis of CoEAs on combinatorial games, this result is a critical step towards a comprehensive theoretical framework for coevolution.
{"title":"Runtime analysis of a coevolutionary algorithm on impartial combinatorial games","authors":"Alistair Benford, Per Kristian Lehre","doi":"arxiv-2409.04177","DOIUrl":"https://doi.org/arxiv-2409.04177","url":null,"abstract":"Due to their complex dynamics, combinatorial games are a key test case and\u0000application for algorithms that train game playing agents. Among those\u0000algorithms that train using self-play are coevolutionary algorithms (CoEAs).\u0000CoEAs evolve a population of individuals by iteratively selecting the strongest\u0000based on their interactions against contemporaries, and using those selected as\u0000parents for the following generation (via randomised mutation and crossover).\u0000However, the successful application of CoEAs for game playing is difficult due\u0000to pathological behaviours such as cycling, an issue especially critical for\u0000games with intransitive payoff landscapes. Insight into how to design CoEAs to avoid such behaviours can be provided by\u0000runtime analysis. In this paper, we push the scope of runtime analysis to\u0000combinatorial games, proving a general upper bound for the number of simulated\u0000games needed for UMDA (a type of CoEA) to discover (with high probability) an\u0000optimal strategy for an impartial combinatorial game. This result applies to\u0000any impartial combinatorial game, and for many games the implied bound is\u0000polynomial or quasipolynomial as a function of the number of game positions.\u0000After proving the main result, we provide several applications to simple\u0000well-known games: Nim, Chomp, Silver Dollar, and Turning Turtles. As the first\u0000runtime analysis for CoEAs on combinatorial games, this result is a critical\u0000step towards a comprehensive theoretical framework for coevolution.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models"
Yuxiao Huang, Xuebin Lv, Shenghao Wu, Jibin Wu, Liang Feng, Kay Chen Tan. arXiv:2409.04270 (2024-09-06)
Evolutionary Multi-task Optimization (EMTO) is a paradigm that leverages knowledge transfer across simultaneously optimized tasks for enhanced search performance. To support this, various knowledge transfer models have been developed for specific optimization tasks. However, designing these models often requires substantial expert knowledge. Recently, large language models (LLMs) have achieved remarkable success in autonomous programming, where the aim is to produce effective solvers for specific problems. In this work, an LLM-based optimization paradigm is introduced to establish an autonomous model factory for generating knowledge transfer models, ensuring effective and efficient knowledge transfer across various optimization tasks. To evaluate the proposed method, we conducted comprehensive empirical studies comparing the knowledge transfer model generated by the LLM with existing state-of-the-art knowledge transfer methods. The results demonstrate that the generated model achieves performance superior or comparable to hand-crafted knowledge transfer models, in terms of both efficiency and effectiveness.
{"title":"Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models","authors":"Yuxiao Huang, Xuebin Lv, Shenghao Wu, Jibin Wu, Liang Feng, Kay Chen Tan","doi":"arxiv-2409.04270","DOIUrl":"https://doi.org/arxiv-2409.04270","url":null,"abstract":"Evolutionary Multi-task Optimization (EMTO) is a paradigm that leverages\u0000knowledge transfer across simultaneously optimized tasks for enhanced search\u0000performance. To facilitate EMTO's performance, various knowledge transfer\u0000models have been developed for specific optimization tasks. However, designing\u0000these models often requires substantial expert knowledge. Recently, large\u0000language models (LLMs) have achieved remarkable success in autonomous\u0000programming, aiming to produce effective solvers for specific problems. In this\u0000work, a LLM-based optimization paradigm is introduced to establish an\u0000autonomous model factory for generating knowledge transfer models, ensuring\u0000effective and efficient knowledge transfer across various optimization tasks.\u0000To evaluate the performance of the proposed method, we conducted comprehensive\u0000empirical studies comparing the knowledge transfer model generated by the LLM\u0000with existing state-of-the-art knowledge transfer methods. The results\u0000demonstrate that the generated model is able to achieve superior or competitive\u0000performance against hand-crafted knowledge transfer models in terms of both\u0000efficiency and effectiveness.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Pareto Set Prediction Assisted Bilevel Multi-objective Optimization"
Bing Wang, Hemant K. Singh, Tapabrata Ray. arXiv:2409.03328 (2024-09-05)

Bilevel optimization problems comprise an upper-level optimization task that contains a lower-level optimization task as a constraint. While there is a significant and growing literature on solving bilevel problems with a single objective at each level using evolutionary computation, relatively little work has addressed problems with multiple objectives (BLMOPs) at both levels. For black-box BLMOPs, existing evolutionary techniques typically use nested search, which in its native form consumes a large number of function evaluations. In this work, we propose to reduce this expense by directly predicting the lower-level Pareto set for a candidate upper-level solution, instead of optimizing from scratch. Such prediction is significantly challenging for BLMOPs because it involves a one-to-many mapping. We resolve this bottleneck by supplementing the dataset with a helper variable and constructing a neural network that can then be trained to map the variables in a meaningful manner. We then embed this prediction-based initialization within a bilevel optimization framework, termed Pareto set prediction assisted evolutionary bilevel multi-objective optimization (PSP-BLEMO). Systematic experiments against existing state-of-the-art methods are presented to demonstrate its benefit. The experiments show that the proposed approach is competitive across a range of problems, including both deceptive and non-deceptive ones.
{"title":"Pareto Set Prediction Assisted Bilevel Multi-objective Optimization","authors":"Bing Wang, Hemant K. Singh, Tapabrata Ray","doi":"arxiv-2409.03328","DOIUrl":"https://doi.org/arxiv-2409.03328","url":null,"abstract":"Bilevel optimization problems comprise an upper level optimization task that\u0000contains a lower level optimization task as a constraint. While there is a\u0000significant and growing literature devoted to solving bilevel problems with\u0000single objective at both levels using evolutionary computation, there is\u0000relatively scarce work done to address problems with multiple objectives\u0000(BLMOP) at both levels. For black-box BLMOPs, the existing evolutionary\u0000techniques typically utilize nested search, which in its native form consumes\u0000large number of function evaluations. In this work, we propose to reduce this\u0000expense by predicting the lower level Pareto set for a candidate upper level\u0000solution directly, instead of conducting an optimization from scratch. Such a\u0000prediction is significantly challenging for BLMOPs as it involves one-to-many\u0000mapping scenario. We resolve this bottleneck by supplementing the dataset using\u0000a helper variable and construct a neural network, which can then be trained to\u0000map the variables in a meaningful manner. Then, we embed this initialization\u0000within a bilevel optimization framework, termed Pareto set prediction assisted\u0000evolutionary bilevel multi-objective optimization (PSP-BLEMO). Systematic\u0000experiments with existing state-of-the-art methods are presented to demonstrate\u0000its benefit. The experiments show that the proposed approach is competitive\u0000across a range of problems, including both deceptive and non-deceptive problems","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small"
Maheep Chaudhary, Atticus Geiger. arXiv:2409.04478 (2024-09-05)

A popular new method in mechanistic interpretability is to train high-dimensional sparse autoencoders (SAEs) on neuron activations and use SAE features as the atomic units of analysis. However, the body of evidence on whether SAE feature spaces are useful for causal analysis is underdeveloped. In this work, we use the RAVEL benchmark to evaluate whether SAEs trained on hidden representations of GPT-2 small have sets of features that separately mediate knowledge of which country a city is in and which continent it is in. We evaluate four open-source SAEs for GPT-2 small against each other, with neurons serving as a baseline, and linear features learned via distributed alignment search (DAS) serving as a skyline. For each, we learn a binary mask to select features that will be patched to change the country of a city without changing the continent, or vice versa. Our results show that SAEs struggle to reach the neuron baseline, and none come close to the DAS skyline. We release code here: https://github.com/MaheepChaudhary/SAE-Ravel
{"title":"Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small","authors":"Maheep Chaudhary, Atticus Geiger","doi":"arxiv-2409.04478","DOIUrl":"https://doi.org/arxiv-2409.04478","url":null,"abstract":"A popular new method in mechanistic interpretability is to train\u0000high-dimensional sparse autoencoders (SAEs) on neuron activations and use SAE\u0000features as the atomic units of analysis. However, the body of evidence on\u0000whether SAE feature spaces are useful for causal analysis is underdeveloped. In\u0000this work, we use the RAVEL benchmark to evaluate whether SAEs trained on\u0000hidden representations of GPT-2 small have sets of features that separately\u0000mediate knowledge of which country a city is in and which continent it is in.\u0000We evaluate four open-source SAEs for GPT-2 small against each other, with\u0000neurons serving as a baseline, and linear features learned via distributed\u0000alignment search (DAS) serving as a skyline. For each, we learn a binary mask\u0000to select features that will be patched to change the country of a city without\u0000changing the continent, or vice versa. Our results show that SAEs struggle to\u0000reach the neuron baseline, and none come close to the DAS skyline. We release\u0000code here: https://github.com/MaheepChaudhary/SAE-Ravel","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"51 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Training-free Conversion of Pretrained ANNs to SNNs for Low-Power and High-Performance Applications"
Tong Bu, Maohua Li, Zhaofei Yu. arXiv:2409.03368 (2024-09-05)

Spiking Neural Networks (SNNs) have emerged as a promising substitute for Artificial Neural Networks (ANNs) due to their advantages of fast inference and low power consumption. However, the lack of efficient training algorithms has hindered their widespread adoption. Existing supervised learning algorithms for SNNs require significantly more memory and time than their ANN counterparts, and even commonly used ANN-SNN conversion methods necessitate re-training the ANN to enhance conversion efficiency, incurring additional computational costs. To address these challenges, we propose a novel training-free ANN-SNN conversion pipeline that directly converts pre-trained ANN models into high-performance SNNs without additional training. The pipeline includes a local-learning-based threshold balancing algorithm, which enables efficient calculation of the optimal thresholds and fine-grained adjustment of threshold values via channel-wise scaling. We demonstrate the scalability of our framework across three typical computer vision tasks: image classification, semantic segmentation, and object detection, showcasing its applicability to both classification and regression tasks. Moreover, we evaluate the energy consumption of the converted SNNs, demonstrating their low-power advantage over conventional ANNs. Our training-free algorithm outperforms existing methods, highlighting its practical applicability and efficiency. This approach simplifies the deployment of SNNs by leveraging open-source pre-trained ANN models and neuromorphic hardware, enabling fast, low-power inference with negligible performance degradation.
{"title":"Training-free Conversion of Pretrained ANNs to SNNs for Low-Power and High-Performance Applications","authors":"Tong Bu, Maohua Li, Zhaofei Yu","doi":"arxiv-2409.03368","DOIUrl":"https://doi.org/arxiv-2409.03368","url":null,"abstract":"Spiking Neural Networks (SNNs) have emerged as a promising substitute for\u0000Artificial Neural Networks (ANNs) due to their advantages of fast inference and\u0000low power consumption. However, the lack of efficient training algorithms has\u0000hindered their widespread adoption. Existing supervised learning algorithms for\u0000SNNs require significantly more memory and time than their ANN counterparts.\u0000Even commonly used ANN-SNN conversion methods necessitate re-training of ANNs\u0000to enhance conversion efficiency, incurring additional computational costs. To\u0000address these challenges, we propose a novel training-free ANN-SNN conversion\u0000pipeline. Our approach directly converts pre-trained ANN models into\u0000high-performance SNNs without additional training. The conversion pipeline\u0000includes a local-learning-based threshold balancing algorithm, which enables\u0000efficient calculation of the optimal thresholds and fine-grained adjustment of\u0000threshold value by channel-wise scaling. We demonstrate the scalability of our\u0000framework across three typical computer vision tasks: image classification,\u0000semantic segmentation, and object detection. This showcases its applicability\u0000to both classification and regression tasks. Moreover, we have evaluated the\u0000energy consumption of the converted SNNs, demonstrating their superior\u0000low-power advantage compared to conventional ANNs. Our training-free algorithm\u0000outperforms existing methods, highlighting its practical applicability and\u0000efficiency. This approach simplifies the deployment of SNNs by leveraging\u0000open-source pre-trained ANN models and neuromorphic hardware, enabling fast,\u0000low-power inference with negligible performance reduction.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"SNNAX -- Spiking Neural Networks in JAX"
Jamie Lohoff, Jan Finkbeiner, Emre Neftci. arXiv:2409.02842 (2024-09-04)

Spiking Neural Network (SNN) simulators are essential tools for prototyping biologically inspired models and neuromorphic hardware architectures and predicting their performance. For such a tool, ease of use and flexibility are critical, but so is simulation speed, especially given the complexity inherent in simulating SNNs. Here, we present SNNAX, a JAX-based framework for simulating and training such models with PyTorch-like intuitiveness and JAX-like execution speed. SNNAX models are easily extended and customized to fit desired model specifications and target neuromorphic hardware. Additionally, SNNAX offers key features for optimizing the training and deployment of SNNs, such as flexible automatic differentiation and just-in-time compilation. We evaluate and compare SNNAX to other commonly used machine learning (ML) frameworks for programming SNNs. We provide key performance metrics, best practices, and documented examples for simulating SNNs in SNNAX, and implement several benchmarks from the literature.
{"title":"SNNAX -- Spiking Neural Networks in JAX","authors":"Jamie Lohoff, Jan Finkbeiner, Emre Neftci","doi":"arxiv-2409.02842","DOIUrl":"https://doi.org/arxiv-2409.02842","url":null,"abstract":"Spiking Neural Networks (SNNs) simulators are essential tools to prototype\u0000biologically inspired models and neuromorphic hardware architectures and\u0000predict their performance. For such a tool, ease of use and flexibility are\u0000critical, but so is simulation speed especially given the complexity inherent\u0000to simulating SNN. Here, we present SNNAX, a JAX-based framework for simulating\u0000and training such models with PyTorch-like intuitiveness and JAX-like execution\u0000speed. SNNAX models are easily extended and customized to fit the desired model\u0000specifications and target neuromorphic hardware. Additionally, SNNAX offers key\u0000features for optimizing the training and deployment of SNNs such as flexible\u0000automatic differentiation and just-in-time compilation. We evaluate and compare\u0000SNNAX to other commonly used machine learning (ML) frameworks used for\u0000programming SNNs. We provide key performance metrics, best practices,\u0000documented examples for simulating SNNs in SNNAX, and implement several\u0000benchmarks used in the literature.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Brain-Inspired Online Adaptation for Remote Sensing with Spiking Neural Network"
Dexin Duan, Peilin Liu, Fei Wen. arXiv:2409.02146 (2024-09-03)

On-device computing, or edge computing, is becoming increasingly important for remote sensing, particularly in applications like deep network-based perception on on-orbit satellites and unmanned aerial vehicles (UAVs). In these scenarios, two brain-like capabilities are crucial for remote sensing models: (1) high energy efficiency, allowing the model to operate on edge devices with limited computing resources, and (2) online adaptation, enabling the model to quickly adapt to environmental variations, weather changes, and sensor drift. This work addresses these needs by proposing an online adaptation framework based on spiking neural networks (SNNs) for remote sensing. Starting with a pretrained SNN model, we design an efficient, unsupervised online adaptation algorithm, which adopts an approximation of the BPTT algorithm and involves only forward-in-time computation, significantly reducing the computational complexity of SNN adaptation learning. In addition, we propose an adaptive activation scaling scheme to boost online SNN adaptation performance, particularly at low time-steps. Furthermore, for the more challenging remote sensing detection task, we propose a confidence-based instance weighting scheme, which substantially improves adaptation performance in detection. To our knowledge, this work is the first to address the online adaptation of SNNs. Extensive experiments on seven benchmark datasets across classification, segmentation, and detection tasks demonstrate that our proposed method significantly outperforms existing domain adaptation and domain generalization approaches under varying weather conditions. The proposed method enables energy-efficient and fast online adaptation on edge devices, and has much potential in applications such as remote perception on on-orbit satellites and UAVs.
{"title":"Brain-Inspired Online Adaptation for Remote Sensing with Spiking Neural Network","authors":"Dexin Duan, Peilin liu, Fei Wen","doi":"arxiv-2409.02146","DOIUrl":"https://doi.org/arxiv-2409.02146","url":null,"abstract":"On-device computing, or edge computing, is becoming increasingly important\u0000for remote sensing, particularly in applications like deep network-based\u0000perception on on-orbit satellites and unmanned aerial vehicles (UAVs). In these\u0000scenarios, two brain-like capabilities are crucial for remote sensing models:\u0000(1) high energy efficiency, allowing the model to operate on edge devices with\u0000limited computing resources, and (2) online adaptation, enabling the model to\u0000quickly adapt to environmental variations, weather changes, and sensor drift.\u0000This work addresses these needs by proposing an online adaptation framework\u0000based on spiking neural networks (SNNs) for remote sensing. Starting with a\u0000pretrained SNN model, we design an efficient, unsupervised online adaptation\u0000algorithm, which adopts an approximation of the BPTT algorithm and only\u0000involves forward-in-time computation that significantly reduces the\u0000computational complexity of SNN adaptation learning. Besides, we propose an\u0000adaptive activation scaling scheme to boost online SNN adaptation performance,\u0000particularly in low time-steps. Furthermore, for the more challenging remote\u0000sensing detection task, we propose a confidence-based instance weighting\u0000scheme, which substantially improves adaptation performance in the detection\u0000task. To our knowledge, this work is the first to address the online adaptation\u0000of SNNs. Extensive experiments on seven benchmark datasets across\u0000classification, segmentation, and detection tasks demonstrate that our proposed\u0000method significantly outperforms existing domain adaptation and domain\u0000generalization approaches under varying weather conditions. The proposed method\u0000enables energy-efficient and fast online adaptation on edge devices, and has\u0000much potential in applications such as remote perception on on-orbit satellites\u0000and UAV.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"CoLaNET -- A Spiking Neural Network with Columnar Layered Architecture for Classification"
Mikhail Kiselev. arXiv:2409.01230 (2024-09-02)

In the present paper, I describe a spiking neural network (SNN) architecture which can be used in a wide range of supervised learning classification tasks. It is assumed that all participating signals (the classified object description, the correct class label, and the SNN decision) have a spiking nature. The distinctive feature of this architecture is the combination of prototypical network structures corresponding to different classes, and to significantly distinct instances of one class (=columns), with functionally differing populations of neurons inside the columns (=layers). The other distinctive feature is a novel combination of anti-Hebbian and dopamine-modulated plasticity. The plasticity rules are local and do not use the backpropagation principle. Besides that, as in my previous studies, I was guided by the requirement that all neuron/plasticity models should be easily implemented on modern neurochips. I illustrate the high performance of my network on the MNIST benchmark.
{"title":"CoLaNET -- A Spiking Neural Network with Columnar Layered Architecture for Classification","authors":"Mikhail Kiselev","doi":"arxiv-2409.01230","DOIUrl":"https://doi.org/arxiv-2409.01230","url":null,"abstract":"In the present paper, I describe a spiking neural network (SNN) architecture\u0000which, can be used in wide range of supervised learning classification tasks.\u0000It is assumed, that all participating signals (the classified object\u0000description, correct class label and SNN decision) have spiking nature. The\u0000distinctive feature of this architecture is a combination of prototypical\u0000network structures corresponding to different classes and significantly\u0000distinctive instances of one class (=columns) and functionally differing\u0000populations of neurons inside columns (=layers). The other distinctive feature\u0000is a novel combination of anti-Hebbian and dopamine-modulated plasticity. The\u0000plasticity rules are local and do not use the backpropagation principle.\u0000Besides that, as in my previous studies, I was guided by the requirement that\u0000the all neuron/plasticity models should be easily implemented on modern\u0000neurochips. I illustrate the high performance of my network on the MNIST\u0000benchmark.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Landscape-Aware Automated Algorithm Configuration using Multi-output Mixed Regression and Classification"
Fu Xing Long, Moritz Frenzel, Peter Krause, Markus Gitterle, Thomas Bäck, Niki van Stein. arXiv:2409.01446 (2024-09-02)
In the landscape-aware algorithm selection problem, the effectiveness of feature-based predictive models strongly depends on how representative the training data is of practical applications. In this work, we investigate the potential of randomly generated functions (RGF) for model training; these cover a much more diverse set of optimization problem classes than the widely used black-box optimization benchmarking (BBOB) suite. Correspondingly, we focus on automated algorithm configuration (AAC): selecting the best-suited algorithm and fine-tuning its hyperparameters based on the landscape features of problem instances. Specifically, we analyze the performance of dense neural network (NN) models in handling the multi-output mixed regression and classification tasks using different training data sets, such as RGF and many-affine BBOB (MA-BBOB) functions. Based on our results on the BBOB functions in 5d and 20d, near-optimal configurations can be identified using the proposed approach, and these usually outperform the off-the-shelf default configuration considered by practitioners with limited knowledge of AAC. Furthermore, the predicted configurations are competitive against the single best solver in many cases. Overall, configurations with better performance are best identified using NN models trained on a combination of RGF and MA-BBOB functions.
{"title":"Landscape-Aware Automated Algorithm Configuration using Multi-output Mixed Regression and Classification","authors":"Fu Xing Long, Moritz Frenzel, Peter Krause, Markus Gitterle, Thomas Bäck, Niki van Stein","doi":"arxiv-2409.01446","DOIUrl":"https://doi.org/arxiv-2409.01446","url":null,"abstract":"In landscape-aware algorithm selection problem, the effectiveness of\u0000feature-based predictive models strongly depends on the representativeness of\u0000training data for practical applications. In this work, we investigate the\u0000potential of randomly generated functions (RGF) for the model training, which\u0000cover a much more diverse set of optimization problem classes compared to the\u0000widely-used black-box optimization benchmarking (BBOB) suite. Correspondingly,\u0000we focus on automated algorithm configuration (AAC), that is, selecting the\u0000best suited algorithm and fine-tuning its hyperparameters based on the\u0000landscape features of problem instances. Precisely, we analyze the performance\u0000of dense neural network (NN) models in handling the multi-output mixed\u0000regression and classification tasks using different training data sets, such as\u0000RGF and many-affine BBOB (MA-BBOB) functions. Based on our results on the BBOB\u0000functions in 5d and 20d, near optimal configurations can be identified using\u0000the proposed approach, which can most of the time outperform the off-the-shelf\u0000default configuration considered by practitioners with limited knowledge about\u0000AAC. Furthermore, the predicted configurations are competitive against the\u0000single best solver in many cases. Overall, configurations with better\u0000performance can be best identified by using NN models trained on a combination\u0000of RGF and MA-BBOB functions.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"68 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"JaxLife: An Open-Ended Agentic Simulator"
Chris Lu, Michael Beukman, Michael Matthews, Jakob Foerster. arXiv:2409.00853 (2024-09-01)
Human intelligence emerged through the process of natural selection and evolution on Earth. We investigate what it would take to re-create this process in silico. While past work has often focused on low-level processes (such as simulating physics or chemistry), we instead take a more targeted approach, aiming to evolve agents that can accumulate open-ended culture and technologies across generations. Towards this, we present JaxLife: an artificial life simulator in which embodied agents, parameterized by deep neural networks, must learn to survive in an expressive world containing programmable systems. First, we describe the environment and show that it can facilitate meaningful Turing-complete computation. We then analyze the emergent behavior of the evolved agents, such as rudimentary communication protocols, agriculture, and tool use. Finally, we investigate how complexity scales with the amount of compute used. We believe JaxLife takes a step towards studying evolved behavior in more open-ended simulations. Our code is available at https://github.com/luchris429/JaxLife
{"title":"JaxLife: An Open-Ended Agentic Simulator","authors":"Chris Lu, Michael Beukman, Michael Matthews, Jakob Foerster","doi":"arxiv-2409.00853","DOIUrl":"https://doi.org/arxiv-2409.00853","url":null,"abstract":"Human intelligence emerged through the process of natural selection and\u0000evolution on Earth. We investigate what it would take to re-create this process\u0000in silico. While past work has often focused on low-level processes (such as\u0000simulating physics or chemistry), we instead take a more targeted approach,\u0000aiming to evolve agents that can accumulate open-ended culture and technologies\u0000across generations. Towards this, we present JaxLife: an artificial life\u0000simulator in which embodied agents, parameterized by deep neural networks, must\u0000learn to survive in an expressive world containing programmable systems. First,\u0000we describe the environment and show that it can facilitate meaningful\u0000Turing-complete computation. We then analyze the evolved emergent agents'\u0000behavior, such as rudimentary communication protocols, agriculture, and tool\u0000use. Finally, we investigate how complexity scales with the amount of compute\u0000used. We believe JaxLife takes a step towards studying evolved behavior in more\u0000open-ended simulations. Our code is available at\u0000https://github.com/luchris429/JaxLife","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"86 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}