Meta-heuristic algorithm development has been a thrust area of research since its inception. In this paper, a novel meta-heuristic optimization algorithm, Olive Ridley Survival (ORS), is proposed, inspired by the survival challenges faced by hatchlings of the Olive Ridley sea turtle. A striking fact about Olive Ridley survival is that, out of one thousand hatchlings that emerge from a nest, only one survives at sea owing to environmental and other factors. This fact forms the backbone of the proposed algorithm. The algorithm has two major phases: hatchling survival under environmental factors, and the impact of the movement trajectory on survival. Both phases are mathematically modelled and implemented along with a suitable input representation and fitness function. The algorithm is analysed theoretically. To validate it, fourteen mathematical benchmark functions from standard CEC test suites are evaluated and statistically tested. To further study the efficacy of ORS on recent, more complex benchmarks, ten benchmark functions from the CEC-06-2019 suite are evaluated. In addition, three well-known engineering problems are solved with ORS and compared against other state-of-the-art meta-heuristics. Simulation results show that, in many cases, the proposed ORS algorithm outperforms several state-of-the-art meta-heuristic optimization algorithms; sub-optimal behaviour of ORS on some of the recent benchmark functions is also observed.
{"title":"ORS: A novel Olive Ridley Survival inspired Meta-heuristic Optimization Algorithm","authors":"Niranjan Panigrahi, Sourav Kumar Bhoi, Debasis Mohapatra, Rashmi Ranjan Sahoo, Kshira Sagar Sahoo, Anil Mohapatra","doi":"arxiv-2409.09210","DOIUrl":"https://doi.org/arxiv-2409.09210","url":null,"abstract":"Meta-heuristic algorithmic development has been a thrust area of research\u0000since its inception. In this paper, a novel meta-heuristic optimization\u0000algorithm, Olive Ridley Survival (ORS), is proposed which is inspired from\u0000survival challenges faced by hatchlings of Olive Ridley sea turtle. A major\u0000fact about survival of Olive Ridley reveals that out of one thousand Olive\u0000Ridley hatchlings which emerge from nest, only one survive at sea due to\u0000various environmental and other factors. This fact acts as the backbone for\u0000developing the proposed algorithm. The algorithm has two major phases:\u0000hatchlings survival through environmental factors and impact of movement\u0000trajectory on its survival. The phases are mathematically modelled and\u0000implemented along with suitable input representation and fitness function. The\u0000algorithm is analysed theoretically. To validate the algorithm, fourteen\u0000mathematical benchmark functions from standard CEC test suites are evaluated\u0000and statistically tested. Also, to study the efficacy of ORS on recent complex\u0000benchmark functions, ten benchmark functions of CEC-06-2019 are evaluated.\u0000Further, three well-known engineering problems are solved by ORS and compared\u0000with other state-of-the-art meta-heuristics. Simulation results show that in\u0000many cases, the proposed ORS algorithm outperforms some state-of-the-art\u0000meta-heuristic optimization algorithms. The sub-optimal behavior of ORS in some\u0000recent benchmark functions is also observed.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spiking neural networks (SNNs), models inspired by the mechanisms of real neurons in the brain, transmit and represent information using discrete action potentials, or spikes. The sparse, asynchronous nature of this information processing makes SNNs highly energy efficient, making them promising candidates for implementing neural networks on neuromorphic devices. However, the non-differentiable nature of SNN neurons makes them challenging to train. Current SNN training methods based on error backpropagation (BP) and carefully designed surrogate gradients are difficult to implement and biologically implausible, hindering the deployment of SNNs on neuromorphic devices. It is therefore important to train SNNs with a method that is both physically implementable and biologically plausible. In this paper, we propose using augmented direct feedback alignment (aDFA), a gradient-free approach based on random projection, to train SNNs. This method requires only partial information about the forward process during training, so it is easy to implement and biologically plausible. We systematically demonstrate the feasibility of the proposed aDFA-SNN scheme, identify its effective working range, and analyse its well-performing settings using a genetic algorithm. We also analyse the impact of crucial SNN features on the scheme, demonstrating its superiority and stability over BP and conventional direct feedback alignment. Our scheme achieves competitive performance without accurate prior knowledge of the underlying system, providing a valuable reference for physically training SNNs.
{"title":"Training Spiking Neural Networks via Augmented Direct Feedback Alignment","authors":"Yongbo Zhang, Katsuma Inoue, Mitsumasa Nakajima, Toshikazu Hashimoto, Yasuo Kuniyoshi, Kohei Nakajima","doi":"arxiv-2409.07776","DOIUrl":"https://doi.org/arxiv-2409.07776","url":null,"abstract":"Spiking neural networks (SNNs), the models inspired by the mechanisms of real\u0000neurons in the brain, transmit and represent information by employing discrete\u0000action potentials or spikes. The sparse, asynchronous properties of information\u0000processing make SNNs highly energy efficient, leading to SNNs being promising\u0000solutions for implementing neural networks in neuromorphic devices. However,\u0000the nondifferentiable nature of SNN neurons makes it a challenge to train them.\u0000The current training methods of SNNs that are based on error backpropagation\u0000(BP) and precisely designing surrogate gradient are difficult to implement and\u0000biologically implausible, hindering the implementation of SNNs on neuromorphic\u0000devices. Thus, it is important to train SNNs with a method that is both\u0000physically implementatable and biologically plausible. In this paper, we\u0000propose using augmented direct feedback alignment (aDFA), a gradient-free\u0000approach based on random projection, to train SNNs. This method requires only\u0000partial information of the forward process during training, so it is easy to\u0000implement and biologically plausible. We systematically demonstrate the\u0000feasibility of the proposed aDFA-SNNs scheme, propose its effective working\u0000range, and analyze its well-performing settings by employing genetic algorithm.\u0000We also analyze the impact of crucial features of SNNs on the scheme, thus\u0000demonstrating its superiority and stability over BP and conventional direct\u0000feedback alignment. Our scheme can achieve competitive performance without\u0000accurate prior knowledge about the utilized system, thus providing a valuable\u0000reference for physically training SNNs.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"37 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the present paper, it is shown how the columnar/layered CoLaNET spiking neural network (SNN) architecture can be used in supervised image classification tasks. Image pixel brightness is coded by the spike count during the image presentation period, and the image class label is indicated by the activity of special SNN input nodes (one node per class). The CoLaNET classification accuracy is evaluated on the MNIST benchmark, and CoLaNET is demonstrated to be almost as accurate as the most advanced machine learning algorithms that do not use a convolutional approach.
{"title":"Classifying Images with CoLaNET Spiking Neural Network -- the MNIST Example","authors":"Mikhail Kiselev","doi":"arxiv-2409.07833","DOIUrl":"https://doi.org/arxiv-2409.07833","url":null,"abstract":"In the present paper, it is shown how the columnar/layered CoLaNET spiking\u0000neural network (SNN) architecture can be used in supervised learning image\u0000classification tasks. Image pixel brightness is coded by the spike count during\u0000image presentation period. Image class label is indicated by activity of\u0000special SNN input nodes (one node per class). The CoLaNET classification\u0000accuracy is evaluated on the MNIST benchmark. It is demonstrated that CoLaNET\u0000is almost as accurate as the most advanced machine learning algorithms (not\u0000using convolutional approach).","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper explores the integration of Diophantine equations into neural network (NN) architectures to improve model interpretability, stability, and efficiency. By encoding and decoding neural network parameters as integer solutions to Diophantine equations, we introduce a novel approach that enhances both the precision and robustness of deep learning models. Our method integrates a custom loss function that enforces Diophantine constraints during training, leading to better generalization, reduced error bounds, and enhanced resilience against adversarial attacks. We demonstrate the efficacy of this approach through several tasks, including image classification and natural language processing, where improvements in accuracy, convergence, and robustness are observed. This study offers a new perspective on combining mathematical theory and machine learning to create more interpretable and efficient models.
{"title":"Optimizing Neural Network Performance and Interpretability with Diophantine Equation Encoding","authors":"Ronald Katende","doi":"arxiv-2409.07310","DOIUrl":"https://doi.org/arxiv-2409.07310","url":null,"abstract":"This paper explores the integration of Diophantine equations into neural\u0000network (NN) architectures to improve model interpretability, stability, and\u0000efficiency. By encoding and decoding neural network parameters as integer\u0000solutions to Diophantine equations, we introduce a novel approach that enhances\u0000both the precision and robustness of deep learning models. Our method\u0000integrates a custom loss function that enforces Diophantine constraints during\u0000training, leading to better generalization, reduced error bounds, and enhanced\u0000resilience against adversarial attacks. We demonstrate the efficacy of this\u0000approach through several tasks, including image classification and natural\u0000language processing, where improvements in accuracy, convergence, and\u0000robustness are observed. This study offers a new perspective on combining\u0000mathematical theory and machine learning to create more interpretable and\u0000efficient models.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this work, we introduce Y-Drop, a regularization method that biases the dropout algorithm towards dropping more important neurons with higher probability. The backbone of our approach is neuron conductance, an interpretable measure of neuron importance that calculates the contribution of each neuron towards the end-to-end mapping of the network. We depart from the uniform dropout selection criterion and investigate its impact on performance by assigning higher dropout probability to the more important units. We show that forcing the network to solve the task at hand in the absence of its important units yields a strong regularization effect. Further analysis indicates that Y-Drop yields solutions in which more neurons are important, i.e., have high conductance, and produces robust networks. In our experiments, we show that the regularization effect of Y-Drop scales better than vanilla dropout with respect to architecture size and consistently yields superior performance over multiple dataset and architecture combinations, with little tuning.
{"title":"Y-Drop: A Conductance based Dropout for fully connected layers","authors":"Efthymios Georgiou, Georgios Paraskevopoulos, Alexandros Potamianos","doi":"arxiv-2409.09088","DOIUrl":"https://doi.org/arxiv-2409.09088","url":null,"abstract":"In this work, we introduce Y-Drop, a regularization method that biases the\u0000dropout algorithm towards dropping more important neurons with higher\u0000probability. The backbone of our approach is neuron conductance, an\u0000interpretable measure of neuron importance that calculates the contribution of\u0000each neuron towards the end-to-end mapping of the network. We investigate the\u0000impact of the uniform dropout selection criterion on performance by assigning\u0000higher dropout probability to the more important units. We show that forcing\u0000the network to solve the task at hand in the absence of its important units\u0000yields a strong regularization effect. Further analysis indicates that Y-Drop\u0000yields solutions where more neurons are important, i.e have high conductance,\u0000and yields robust networks. In our experiments we show that the regularization\u0000effect of Y-Drop scales better than vanilla dropout w.r.t. the architecture\u0000size and consistently yields superior performance over multiple datasets and\u0000architecture combinations, with little tuning.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"190 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alzheimer's Disease is an incurable cognitive condition that affects millions of people globally. While some diagnostic methods exist for Alzheimer's Disease, many of them cannot detect it in its earlier stages. Recently, researchers have explored the use of Electroencephalogram (EEG) technology for diagnosing Alzheimer's. EEG is a noninvasive method of recording the brain's electrical signals, and EEG data has shown distinct differences between patients with and without Alzheimer's. In the past, Artificial Neural Networks (ANNs) have been used to predict Alzheimer's from EEG data, but these models sometimes produce false positive diagnoses. This study compares the losses of ANNs and Kolmogorov-Arnold Networks (KANs) across different numbers of epochs, learning rates, and node counts. The results show that, across these parameters, ANNs are more accurate in predicting Alzheimer's Disease from EEG signals.
{"title":"A Comprehensive Comparison Between ANNs and KANs For Classifying EEG Alzheimer's Data","authors":"Akshay Sunkara, Sriram Sattiraju, Aakarshan Kumar, Zaryab Kanjiani, Himesh Anumala","doi":"arxiv-2409.05989","DOIUrl":"https://doi.org/arxiv-2409.05989","url":null,"abstract":"Alzheimer's Disease is an incurable cognitive condition that affects\u0000thousands of people globally. While some diagnostic methods exist for\u0000Alzheimer's Disease, many of these methods cannot detect Alzheimer's in its\u0000earlier stages. Recently, researchers have explored the use of\u0000Electroencephalogram (EEG) technology for diagnosing Alzheimer's. EEG is a\u0000noninvasive method of recording the brain's electrical signals, and EEG data\u0000has shown distinct differences between patients with and without Alzheimer's.\u0000In the past, Artificial Neural Networks (ANNs) have been used to predict\u0000Alzheimer's from EEG data, but these models sometimes produce false positive\u0000diagnoses. This study aims to compare losses between ANNs and Kolmogorov-Arnold\u0000Networks (KANs) across multiple types of epochs, learning rates, and nodes. The\u0000results show that across these different parameters, ANNs are more accurate in\u0000predicting Alzheimer's Disease from EEG signals.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"160 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evolutionary Multi-task Optimization (EMTO) is a paradigm that leverages knowledge transfer across simultaneously optimized tasks for enhanced search performance. To facilitate EMTO's performance, various knowledge transfer models have been developed for specific optimization tasks. However, designing these models often requires substantial expert knowledge. Recently, large language models (LLMs) have achieved remarkable success in autonomous programming, which aims to produce effective solvers for specific problems. In this work, an LLM-based optimization paradigm is introduced to establish an autonomous model factory for generating knowledge transfer models, ensuring effective and efficient knowledge transfer across various optimization tasks. To evaluate the performance of the proposed method, we conduct comprehensive empirical studies comparing the knowledge transfer model generated by the LLM with existing state-of-the-art knowledge transfer methods. The results demonstrate that the generated model achieves superior or competitive performance relative to hand-crafted knowledge transfer models, in terms of both efficiency and effectiveness.
{"title":"Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models","authors":"Yuxiao Huang, Xuebin Lv, Shenghao Wu, Jibin Wu, Liang Feng, Kay Chen Tan","doi":"arxiv-2409.04270","DOIUrl":"https://doi.org/arxiv-2409.04270","url":null,"abstract":"Evolutionary Multi-task Optimization (EMTO) is a paradigm that leverages\u0000knowledge transfer across simultaneously optimized tasks for enhanced search\u0000performance. To facilitate EMTO's performance, various knowledge transfer\u0000models have been developed for specific optimization tasks. However, designing\u0000these models often requires substantial expert knowledge. Recently, large\u0000language models (LLMs) have achieved remarkable success in autonomous\u0000programming, aiming to produce effective solvers for specific problems. In this\u0000work, a LLM-based optimization paradigm is introduced to establish an\u0000autonomous model factory for generating knowledge transfer models, ensuring\u0000effective and efficient knowledge transfer across various optimization tasks.\u0000To evaluate the performance of the proposed method, we conducted comprehensive\u0000empirical studies comparing the knowledge transfer model generated by the LLM\u0000with existing state-of-the-art knowledge transfer methods. The results\u0000demonstrate that the generated model is able to achieve superior or competitive\u0000performance against hand-crafted knowledge transfer models in terms of both\u0000efficiency and effectiveness.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bilevel optimization problems comprise an upper-level optimization task that contains a lower-level optimization task as a constraint. While there is a significant and growing literature on solving bilevel problems with a single objective at both levels using evolutionary computation, relatively little work addresses problems with multiple objectives at both levels (BLMOPs). For black-box BLMOPs, existing evolutionary techniques typically rely on nested search, which in its native form consumes a large number of function evaluations. In this work, we propose to reduce this expense by directly predicting the lower-level Pareto set for a candidate upper-level solution, instead of conducting an optimization from scratch. Such prediction is significantly challenging for BLMOPs because it involves a one-to-many mapping. We resolve this bottleneck by supplementing the dataset with a helper variable and constructing a neural network that can then be trained to map the variables in a meaningful manner. We then embed this initialization within a bilevel optimization framework, termed Pareto set prediction assisted evolutionary bilevel multi-objective optimization (PSP-BLEMO). Systematic experiments against existing state-of-the-art methods are presented to demonstrate its benefit. The experiments show that the proposed approach is competitive across a range of problems, including both deceptive and non-deceptive ones.
{"title":"Pareto Set Prediction Assisted Bilevel Multi-objective Optimization","authors":"Bing Wang, Hemant K. Singh, Tapabrata Ray","doi":"arxiv-2409.03328","DOIUrl":"https://doi.org/arxiv-2409.03328","url":null,"abstract":"Bilevel optimization problems comprise an upper level optimization task that\u0000contains a lower level optimization task as a constraint. While there is a\u0000significant and growing literature devoted to solving bilevel problems with\u0000single objective at both levels using evolutionary computation, there is\u0000relatively scarce work done to address problems with multiple objectives\u0000(BLMOP) at both levels. For black-box BLMOPs, the existing evolutionary\u0000techniques typically utilize nested search, which in its native form consumes\u0000large number of function evaluations. In this work, we propose to reduce this\u0000expense by predicting the lower level Pareto set for a candidate upper level\u0000solution directly, instead of conducting an optimization from scratch. Such a\u0000prediction is significantly challenging for BLMOPs as it involves one-to-many\u0000mapping scenario. We resolve this bottleneck by supplementing the dataset using\u0000a helper variable and construct a neural network, which can then be trained to\u0000map the variables in a meaningful manner. Then, we embed this initialization\u0000within a bilevel optimization framework, termed Pareto set prediction assisted\u0000evolutionary bilevel multi-objective optimization (PSP-BLEMO). Systematic\u0000experiments with existing state-of-the-art methods are presented to demonstrate\u0000its benefit. The experiments show that the proposed approach is competitive\u0000across a range of problems, including both deceptive and non-deceptive problems","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spiking Neural Networks (SNNs) have emerged as a promising substitute for Artificial Neural Networks (ANNs) thanks to their fast inference and low power consumption. However, the lack of efficient training algorithms has hindered their widespread adoption. Existing supervised learning algorithms for SNNs require significantly more memory and time than their ANN counterparts, and even commonly used ANN-SNN conversion methods necessitate re-training the ANN to improve conversion efficiency, incurring additional computational cost. To address these challenges, we propose a novel training-free ANN-SNN conversion pipeline that directly converts pre-trained ANN models into high-performance SNNs without additional training. The conversion pipeline includes a local-learning-based threshold balancing algorithm, which enables efficient calculation of the optimal thresholds and fine-grained adjustment of the threshold values through channel-wise scaling. We demonstrate the scalability of our framework across three typical computer vision tasks: image classification, semantic segmentation, and object detection, showing its applicability to both classification and regression tasks. Moreover, we evaluate the energy consumption of the converted SNNs, demonstrating their low-power advantage over conventional ANNs. Our training-free algorithm outperforms existing methods, highlighting its practical applicability and efficiency. The approach simplifies the deployment of SNNs by leveraging open-source pre-trained ANN models and neuromorphic hardware, enabling fast, low-power inference with negligible performance loss.
{"title":"Training-free Conversion of Pretrained ANNs to SNNs for Low-Power and High-Performance Applications","authors":"Tong Bu, Maohua Li, Zhaofei Yu","doi":"arxiv-2409.03368","DOIUrl":"https://doi.org/arxiv-2409.03368","url":null,"abstract":"Spiking Neural Networks (SNNs) have emerged as a promising substitute for\u0000Artificial Neural Networks (ANNs) due to their advantages of fast inference and\u0000low power consumption. However, the lack of efficient training algorithms has\u0000hindered their widespread adoption. Existing supervised learning algorithms for\u0000SNNs require significantly more memory and time than their ANN counterparts.\u0000Even commonly used ANN-SNN conversion methods necessitate re-training of ANNs\u0000to enhance conversion efficiency, incurring additional computational costs. To\u0000address these challenges, we propose a novel training-free ANN-SNN conversion\u0000pipeline. Our approach directly converts pre-trained ANN models into\u0000high-performance SNNs without additional training. The conversion pipeline\u0000includes a local-learning-based threshold balancing algorithm, which enables\u0000efficient calculation of the optimal thresholds and fine-grained adjustment of\u0000threshold value by channel-wise scaling. We demonstrate the scalability of our\u0000framework across three typical computer vision tasks: image classification,\u0000semantic segmentation, and object detection. This showcases its applicability\u0000to both classification and regression tasks. Moreover, we have evaluated the\u0000energy consumption of the converted SNNs, demonstrating their superior\u0000low-power advantage compared to conventional ANNs. Our training-free algorithm\u0000outperforms existing methods, highlighting its practical applicability and\u0000efficiency. This approach simplifies the deployment of SNNs by leveraging\u0000open-source pre-trained ANN models and neuromorphic hardware, enabling fast,\u0000low-power inference with negligible performance reduction.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spiking neural network (SNN) simulators are essential tools for prototyping biologically inspired models and neuromorphic hardware architectures and for predicting their performance. For such a tool, ease of use and flexibility are critical, but so is simulation speed, especially given the complexity inherent in simulating SNNs. Here, we present SNNAX, a JAX-based framework for simulating and training such models with PyTorch-like intuitiveness and JAX-like execution speed. SNNAX models are easily extended and customized to fit the desired model specifications and target neuromorphic hardware. Additionally, SNNAX offers key features for optimizing the training and deployment of SNNs, such as flexible automatic differentiation and just-in-time compilation. We evaluate and compare SNNAX with other machine learning (ML) frameworks commonly used for programming SNNs. We provide key performance metrics, best practices, and documented examples for simulating SNNs in SNNAX, and we implement several benchmarks from the literature.
{"title":"SNNAX -- Spiking Neural Networks in JAX","authors":"Jamie Lohoff, Jan Finkbeiner, Emre Neftci","doi":"arxiv-2409.02842","DOIUrl":"https://doi.org/arxiv-2409.02842","url":null,"abstract":"Spiking Neural Networks (SNNs) simulators are essential tools to prototype\u0000biologically inspired models and neuromorphic hardware architectures and\u0000predict their performance. For such a tool, ease of use and flexibility are\u0000critical, but so is simulation speed especially given the complexity inherent\u0000to simulating SNN. Here, we present SNNAX, a JAX-based framework for simulating\u0000and training such models with PyTorch-like intuitiveness and JAX-like execution\u0000speed. SNNAX models are easily extended and customized to fit the desired model\u0000specifications and target neuromorphic hardware. Additionally, SNNAX offers key\u0000features for optimizing the training and deployment of SNNs such as flexible\u0000automatic differentiation and just-in-time compilation. We evaluate and compare\u0000SNNAX to other commonly used machine learning (ML) frameworks used for\u0000programming SNNs. We provide key performance metrics, best practices,\u0000documented examples for simulating SNNs in SNNAX, and implement several\u0000benchmarks used in the literature.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142188235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}