Active Learning for Neural PDE Solvers
Daniel Musekamp, Marimuthu Kalimuthu, David Holzmüller, Makoto Takamoto, Mathias Niepert (arXiv:2408.01536, 2024-08-02)
Solving partial differential equations (PDEs) is a fundamental problem in engineering and science. While neural PDE solvers can be more efficient than established numerical solvers, they often require large amounts of training data that is costly to obtain. Active Learning (AL) could help surrogate models reach the same accuracy with smaller training sets by querying classical solvers with more informative initial conditions and PDE parameters. While AL is more common in other domains, it has yet to be studied extensively for neural PDE solvers. To bridge this gap, we introduce AL4PDE, a modular and extensible active learning benchmark. It provides multiple parametric PDEs and state-of-the-art surrogate models for the solver-in-the-loop setting, enabling both the evaluation of existing AL methods for PDE solving and the development of new ones. We use the benchmark to evaluate batch active learning algorithms such as uncertainty- and feature-based methods. We show that AL reduces the average error by up to 71% compared to random sampling and significantly reduces worst-case errors. Moreover, AL generates similar datasets across repeated runs, with consistent distributions over the PDE parameters and initial conditions. The acquired datasets are reusable, providing benefits for surrogate models not involved in the data generation.
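The solver-in-the-loop setting the benchmark targets reduces to a simple loop: train surrogates on the labelled pool, score a candidate pool of initial conditions and PDE parameters, query the classical solver on the most informative batch, and repeat. The sketch below is a minimal, hypothetical instance of that loop; the toy `expensive_solver` function, the MLP ensemble, and the variance-based acquisition merely stand in for AL4PDE's actual solvers, surrogates, and AL methods, none of which are specified in the abstract.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def expensive_solver(params):
    # Stand-in for a classical PDE solver: maps (IC, PDE parameter)
    # samples to a scalar solution summary. Purely a toy function.
    return np.sin(3 * params[:, 0]) * params[:, 1]

def ensemble_variance(models, X):
    # Uncertainty-based acquisition score: disagreement of an ensemble.
    preds = np.stack([m.predict(X) for m in models])
    return preds.var(axis=0)

# Initial random design, labelled by the expensive solver
X_train = rng.uniform(-1, 1, size=(16, 2))
y_train = expensive_solver(X_train)

for al_round in range(3):
    # Retrain a small surrogate ensemble on the labelled pool
    models = [MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                           random_state=seed).fit(X_train, y_train)
              for seed in range(4)]
    # Score a large candidate pool of ICs/parameters by predictive variance
    X_pool = rng.uniform(-1, 1, size=(512, 2))
    scores = ensemble_variance(models, X_pool)
    # Query the solver on the top-k most uncertain candidates (batch AL)
    batch = X_pool[np.argsort(scores)[-8:]]
    X_train = np.vstack([X_train, batch])
    y_train = np.concatenate([y_train, expensive_solver(batch)])
    print(f"round {al_round}: train size {len(X_train)}")
```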
{"title":"Active Learning for Neural PDE Solvers","authors":"Daniel Musekamp, Marimuthu Kalimuthu, David Holzmüller, Makoto Takamoto, Mathias Niepert","doi":"arxiv-2408.01536","DOIUrl":"https://doi.org/arxiv-2408.01536","url":null,"abstract":"Solving partial differential equations (PDEs) is a fundamental problem in\u0000engineering and science. While neural PDE solvers can be more efficient than\u0000established numerical solvers, they often require large amounts of training\u0000data that is costly to obtain. Active Learning (AL) could help surrogate models\u0000reach the same accuracy with smaller training sets by querying classical\u0000solvers with more informative initial conditions and PDE parameters. While AL\u0000is more common in other domains, it has yet to be studied extensively for\u0000neural PDE solvers. To bridge this gap, we introduce AL4PDE, a modular and\u0000extensible active learning benchmark. It provides multiple parametric PDEs and\u0000state-of-the-art surrogate models for the solver-in-the-loop setting, enabling\u0000the evaluation of existing and the development of new AL methods for PDE\u0000solving. We use the benchmark to evaluate batch active learning algorithms such\u0000as uncertainty- and feature-based methods. We show that AL reduces the average\u0000error by up to 71% compared to random sampling and significantly reduces\u0000worst-case errors. Moreover, AL generates similar datasets across repeated\u0000runs, with consistent distributions over the PDE parameters and initial\u0000conditions. The acquired datasets are reusable, providing benefits for\u0000surrogate models not involved in the data generation.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
General-purpose Dataflow Model with Neuromorphic Primitives
Weihao Zhang, Yu Du, Hongyi Li, Songchen Ma, Rong Zhao (arXiv:2408.01090, 2024-08-02)
Neuromorphic computing exhibits great potential to provide high-performance benefits in various applications beyond neural networks. However, a general-purpose program execution model that aligns with the features of neuromorphic computing is required to bridge the gap between program versatility and neuromorphic hardware efficiency. The dataflow model offers a potential solution, but when dealing with control flow programs it suffers from high graph complexity and incompatibility with neuromorphic hardware, which degrade programmability and performance. Here, we present a dataflow model tailored for neuromorphic hardware, called neuromorphic dataflow, which provides a compact, concise, and neuromorphic-compatible program representation for control logic. The neuromorphic dataflow introduces "when" and "where" primitives, which restructure the view of control, and embeds these primitives in the dataflow schema with the plasticity inherited from spiking algorithms. Our method enables the deployment of general-purpose programs on neuromorphic hardware with both programmability and plasticity, while fully utilizing the hardware's potential.
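The abstract does not define the formal semantics of the "when" and "where" primitives, so the following toy token-driven interpreter is only one plausible reading of the idea: control decisions become routing of data tokens through a graph rather than branch instructions, which is the restructuring of control that dataflow models require.

```python
from collections import deque

def run(graph, token):
    """Drive a token through a dataflow graph until it reaches 'exit'."""
    queue, outputs = deque([("entry", token)]), []
    while queue:
        name, value = queue.popleft()
        if name == "exit":
            outputs.append(value)
            continue
        # each node returns a list of (successor, token) pairs
        queue.extend(graph[name](value))
    return outputs

# Absolute value expressed as token flow instead of a branch instruction:
# the entry node acts as a "when"-style gate (a successor fires only when
# its condition holds); a "where"-style node would instead pick the
# successor from a data-derived routing key.
graph = {
    "entry": lambda v: [("neg" if v < 0 else "pos", v)],
    "neg":   lambda v: [("exit", -v)],
    "pos":   lambda v: [("exit", v)],
}
print(run(graph, -3))   # [3]
```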
{"title":"General-purpose Dataflow Model with Neuromorphic Primitives","authors":"Weihao Zhang, Yu Du, Hongyi Li, Songchen Ma, Rong Zhao","doi":"arxiv-2408.01090","DOIUrl":"https://doi.org/arxiv-2408.01090","url":null,"abstract":"Neuromorphic computing exhibits great potential to provide high-performance\u0000benefits in various applications beyond neural networks. However, a\u0000general-purpose program execution model that aligns with the features of\u0000neuromorphic computing is required to bridge the gap between program\u0000versatility and neuromorphic hardware efficiency. The dataflow model offers a\u0000potential solution, but it faces high graph complexity and incompatibility with\u0000neuromorphic hardware when dealing with control flow programs, which decreases\u0000the programmability and performance. Here, we present a dataflow model tailored\u0000for neuromorphic hardware, called neuromorphic dataflow, which provides a\u0000compact, concise, and neuromorphic-compatible program representation for\u0000control logic. The neuromorphic dataflow introduces \"when\" and \"where\"\u0000primitives, which restructure the view of control. The neuromorphic dataflow\u0000embeds these primitives in the dataflow schema with the plasticity inherited\u0000from the spiking algorithms. Our method enables the deployment of\u0000general-purpose programs on neuromorphic hardware with both programmability and\u0000plasticity, while fully utilizing the hardware's potential.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"44 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Continuous-Time Neural Networks Can Stably Memorize Random Spike Trains
Hugo Aguettaz, Hans-Andrea Loeliger (arXiv:2408.01166, 2024-08-02)
The paper explores the capability of continuous-time recurrent neural networks to store and recall precisely timed spike patterns. We show by numerical experiments that this is indeed possible: within some range of parameters, any random score of spike trains (for all neurons in the network) can be robustly memorized and autonomously reproduced with stable, accurate relative timing of all spikes, with probability close to one. We also demonstrate associative recall under noisy conditions. In these experiments, the required synaptic weights are computed offline, to satisfy a template that encourages temporal stability.
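The paper's construction is continuous-time and its stability template is not reproduced in the abstract; the discrete-time toy below only illustrates the general recipe of computing weights offline so that each neuron's synaptic drive satisfies a margin template (above threshold in target spike bins, below elsewhere), then replaying the pattern autonomously. The trace decay, margin, and ridge term are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 20, 200                      # neurons, time bins
S = rng.random((N, T)) < 0.05       # random target spike score (binary)

# Presynaptic trace: exponentially filtered spikes (causal, one-bin delay)
tau, trace = 0.9, np.zeros((N, T))
for t in range(1, T):
    trace[:, t] = tau * trace[:, t - 1] + S[:, t - 1]

# Margin template: drive >= +1 in spike bins, <= -1 otherwise, solved here
# in the least-squares sense with ridge regularization (offline, as in the
# paper's offline weight computation).
target = np.where(S, 1.0, -1.0)
lam = 1e-2
A = trace @ trace.T + lam * np.eye(N)
W = np.linalg.solve(A, trace @ target.T).T    # W[i, j]: weight j -> i

# Replay autonomously, seeded with the stored pattern's first bin
replay = np.zeros((N, T), dtype=bool)
replay[:, 0] = S[:, 0]
tr = np.zeros(N)
for t in range(1, T):
    tr = tau * tr + replay[:, t - 1]
    replay[:, t] = (W @ tr) > 0.0
print("bins matching target:", (replay == S).mean())
```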
{"title":"Continuous-Time Neural Networks Can Stably Memorize Random Spike Trains","authors":"Hugo Aguettaz, Hans-Andrea Loeliger","doi":"arxiv-2408.01166","DOIUrl":"https://doi.org/arxiv-2408.01166","url":null,"abstract":"The paper explores the capability of continuous-time recurrent neural\u0000networks to store and recall precisely timed spike patterns. We show (by\u0000numerical experiments) that this is indeed possible: within some range of\u0000parameters, any random score of spike trains (for all neurons in the network)\u0000can be robustly memorized and autonomously reproduced with stable accurate\u0000relative timing of all spikes, with probability close to one. We also\u0000demonstrate associative recall under noisy conditions. In these experiments, the required synaptic weights are computed offline, to\u0000satisfy a template that encourages temporal stability.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"113 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using CSNNs to Perform Event-based Data Processing & Classification on ASL-DVS
Ria Patel, Sujit Tripathy, Zachary Sublett, Seoyoung An, Riya Patel (arXiv:2408.00611, 2024-08-01)
Recent advancements in bio-inspired visual sensing and neuromorphic computing have led to the development of various highly efficient bio-inspired solutions with real-world applications. One notable application integrates event-based cameras with spiking neural networks (SNNs) to process event-based sequences, which are asynchronous and sparse and therefore difficult to handle. In this project, we develop a convolutional spiking neural network (CSNN) architecture that leverages convolutional operations and the recurrent properties of spiking neurons to learn the spatial and temporal relations in the ASL-DVS gesture dataset. The ASL-DVS gesture dataset is a neuromorphic dataset of hand gestures for 24 letters (A to Y, excluding J and Z due to the nature of their signs) of the American Sign Language (ASL). We performed classification on a pre-processed subset of the full ASL-DVS dataset to identify letter signs and achieved 100% training accuracy. Specifically, this was achieved by training on the Google Cloud compute platform with a learning rate of 0.0005, a batch size of 25 (20 batches in total), 200 iterations, and 10 epochs.
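The abstract does not specify the CSNN architecture, so the following PyTorch sketch is hypothetical: a single conv layer feeding a minimal leaky integrate-and-fire (LIF) layer with a straight-through surrogate gradient, trained with the hyperparameters the abstract does report (learning rate 0.0005, batch size 25, 10 epochs, 24 classes). The LIF dynamics, surrogate, and input shape are illustrative assumptions, with random tensors standing in for pre-processed ASL-DVS event frames.

```python
import torch
import torch.nn as nn

def spike_fn(mem, threshold=1.0):
    # Heaviside forward pass with a sigmoid straight-through surrogate
    # gradient, so the convolutional layers receive a learning signal.
    soft = torch.sigmoid(4.0 * (mem - threshold))
    return (mem > threshold).float().detach() + soft - soft.detach()

class LIF(nn.Module):
    """Minimal leaky integrate-and-fire layer (recurrent over time)."""
    def __init__(self, beta=0.9, threshold=1.0):
        super().__init__()
        self.beta, self.threshold = beta, threshold

    def forward(self, x_seq):                    # x_seq: [T, B, ...]
        mem, out = torch.zeros_like(x_seq[0]), []
        for x in x_seq:
            mem = self.beta * mem + x            # leaky integration
            spk = spike_fn(mem, self.threshold)
            mem = mem - spk * self.threshold     # soft reset on spike
            out.append(spk)
        return torch.stack(out)

class CSNN(nn.Module):
    def __init__(self, n_classes=24):            # 24 ASL letters
        super().__init__()
        self.conv = nn.Conv2d(2, 8, 3, padding=1)  # 2 DVS polarity channels
        self.pool = nn.AdaptiveAvgPool2d(8)
        self.lif = LIF()
        self.fc = nn.Linear(8 * 8 * 8, n_classes)

    def forward(self, x):                        # x: [T, B, 2, H, W]
        cur = torch.stack([self.pool(self.conv(xt)) for xt in x])
        spk = self.lif(cur)
        return self.fc(spk.flatten(2)).mean(0)   # rate-coded logits

model = CSNN()
opt = torch.optim.Adam(model.parameters(), lr=5e-4)  # lr from the abstract
loss_fn = nn.CrossEntropyLoss()
x = torch.rand(20, 25, 2, 32, 32)    # dummy event frames, batch size 25
y = torch.randint(0, 24, (25,))
for epoch in range(10):              # 10 epochs, as reported
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```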
{"title":"Using CSNNs to Perform Event-based Data Processing & Classification on ASL-DVS","authors":"Ria Patel, Sujit Tripathy, Zachary Sublett, Seoyoung An, Riya Patel","doi":"arxiv-2408.00611","DOIUrl":"https://doi.org/arxiv-2408.00611","url":null,"abstract":"Recent advancements in bio-inspired visual sensing and neuromorphic computing\u0000have led to the development of various highly efficient bio-inspired solutions\u0000with real-world applications. One notable application integrates event-based\u0000cameras with spiking neural networks (SNNs) to process event-based sequences\u0000that are asynchronous and sparse, making them difficult to handle. In this\u0000project, we develop a convolutional spiking neural network (CSNN) architecture\u0000that leverages convolutional operations and recurrent properties of a spiking\u0000neuron to learn the spatial and temporal relations in the ASL-DVS gesture\u0000dataset. The ASL-DVS gesture dataset is a neuromorphic dataset containing hand\u0000gestures when displaying 24 letters (A to Y, excluding J and Z due to the\u0000nature of their symbols) from the American Sign Language (ASL). We performed\u0000classification on a pre-processed subset of the full ASL-DVS dataset to\u0000identify letter signs and achieved 100% training accuracy. Specifically, this\u0000was achieved by training in the Google Cloud compute platform while using a\u0000learning rate of 0.0005, batch size of 25 (total of 20 batches), 200\u0000iterations, and 10 epochs.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"82 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141886829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High Performance Im2win and Direct Convolutions using Three Tensor Layouts on SIMD Architectures
Xiang Fu, Xinpeng Zhang, Jixiang Ma, Peng Zhao, Shuai Lu, Xu T. Liu (arXiv:2408.00278, 2024-08-01)
Convolution is the core component of deep neural networks, and it is computationally intensive and time-consuming. Tensor data layouts significantly impact convolution operations in terms of memory access and computational efficiency. Yet there is still no comprehensive performance characterization of how data layouts affect convolution methods on SIMD architectures. This paper proposes three novel data layouts for im2win convolution: NHWC, CHWN, and CHWN8, and introduces a set of general optimization techniques for both direct and im2win convolutions. We compare the optimized im2win convolution with the direct convolution and PyTorch's im2col-based convolution across the aforementioned layouts on SIMD machines. The experiments demonstrate that the im2win convolution with the new NHWC layout achieves up to a 355% speedup over the NCHW layout. Our optimizations also significantly improve the performance of both im2win and direct convolutions, which achieve up to 95% and 94% of the machine's theoretical peak performance, respectively.
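The three layouts are index permutations of the same tensor; what differs is which dimension ends up innermost and therefore gets unit-stride, vectorizable access. The numpy sketch below shows the permutations and one common SIMD-friendly batch blocking for CHWN8 (eight floats, i.e. one AVX2 register, per innermost block); the paper's exact CHWN8 blocking may differ, so treat this as an assumption.

```python
import numpy as np

N, C, H, W = 32, 16, 28, 28
x_nchw = np.arange(N * C * H * W, dtype=np.float32).reshape(N, C, H, W)

# NHWC: channels innermost -- unit-stride access across C
x_nhwc = x_nchw.transpose(0, 2, 3, 1).copy()

# CHWN: batch innermost -- unit-stride access across N
x_chwn = x_nchw.transpose(1, 2, 3, 0).copy()

# CHWN8: CHWN with the batch dimension blocked in groups of 8, so one
# SIMD register (e.g. AVX2, 8 floats) holds the same pixel of 8 images.
assert N % 8 == 0
x_chwn8 = x_chwn.reshape(C, H, W, N // 8, 8).transpose(3, 0, 1, 2, 4).copy()
print(x_chwn8.shape)   # (N/8, C, H, W, 8): innermost axis fills one vector
```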
{"title":"High Performance Im2win and Direct Convolutions using Three Tensor Layouts on SIMD Architectures","authors":"Xiang Fu, Xinpeng Zhang, Jixiang Ma, Peng Zhao, Shuai Lu, Xu T. Liu","doi":"arxiv-2408.00278","DOIUrl":"https://doi.org/arxiv-2408.00278","url":null,"abstract":"Convolution is the core component within deep neural networks and it is\u0000computationally intensive and time consuming. Tensor data layouts significantly\u0000impact convolution operations in terms of memory access and computational\u0000efficiency. Yet, there is still a lack of comprehensive performance\u0000characterization on data layouts on SIMD architectures concerning convolution\u0000methods. This paper proposes three novel data layouts for im2win convolution:\u0000NHWC, CHWN, and CHWN8, and introduces a set of general optimization techniques\u0000for both direct and im2win convolutions. We compare the optimized im2win\u0000convolution with the direct convolution and PyTorch's im2col-based convolution\u0000across the aforementioned layouts on SIMD machines. The experiments\u0000demonstrated that the im2win convolution with the new NHWC layout achieved up\u0000to 355% performance speedup over NCHW layout. Our optimizations also\u0000significantly improve the performance of both im2win and direct convolutions.\u0000Our optimized im2win and direct convolutions achieved up to 95% and 94% of\u0000machine's theoretical peak performance, respectively.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"98 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141886831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hilbert curves for efficient exploratory landscape analysis neighbourhood sampling
Johannes J. Pienaar, Anna S. Bosman, Katherine M. Malan (arXiv:2408.00526, 2024-08-01)
Landscape analysis aims to characterise optimisation problems based on their objective (or fitness) function landscape properties. The problem search space is typically sampled, and various landscape features are estimated based on the samples. One particularly salient set of features is information content, which requires the samples to be sequences of neighbouring solutions, such that the local relationships between consecutive sample points are preserved. Generating such spatially correlated samples that also provide good search space coverage is challenging. It is therefore common to first obtain an unordered sample with good search space coverage, and then apply an ordering algorithm such as the nearest neighbour to minimise the distance between consecutive points in the sample. However, the nearest neighbour algorithm becomes computationally prohibitive in higher dimensions, thus there is a need for more efficient alternatives. In this study, Hilbert space-filling curves are proposed as a method to efficiently obtain high-quality ordered samples. Hilbert curves are a special case of fractal curves, and guarantee uniform coverage of a bounded search space while providing a spatially correlated sample. We study the effectiveness of Hilbert curves as samplers, and discover that they are capable of extracting salient features at a fraction of the computational cost compared to Latin hypercube sampling with post-factum ordering. Further, we investigate the use of Hilbert curves as an ordering strategy, and find that they order the sample significantly faster than the nearest neighbour ordering, without sacrificing the saliency of the extracted features.
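The key property is that consecutive Hilbert-curve indices map to neighbouring points, so walking the curve yields a spatially correlated sample with uniform coverage and no post-hoc ordering step. The sketch below uses the standard 2D index-to-coordinate conversion and a toy sphere objective; the study targets higher-dimensional search spaces (via generalized Hilbert indexing), so this is only an illustration of the sampling pattern.

```python
def hilbert_d2xy(order, d):
    """Map index d along a (2**order x 2**order) Hilbert curve to (x, y)."""
    x = y = 0
    s, t = 1, d
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                        # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def sphere(p):                             # toy objective function
    return sum(v * v for v in p)

order = 5
side = 2 ** order
walk = []
for d in range(side * side):
    gx, gy = hilbert_d2xy(order, d)
    # map grid cells onto the search domain [-5, 5]^2
    point = (-5 + 10 * gx / (side - 1), -5 + 10 * gy / (side - 1))
    walk.append(sphere(point))

# Consecutive entries are spatial neighbours, so the fitness walk can feed
# information-content features directly, e.g. via first differences:
diffs = [b - a for a, b in zip(walk, walk[1:])]
print(len(walk), max(abs(v) for v in diffs))
```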
{"title":"Hilbert curves for efficient exploratory landscape analysis neighbourhood sampling","authors":"Johannes J. Pienaar, Anna S. Bosman, Katherine M. Malan","doi":"arxiv-2408.00526","DOIUrl":"https://doi.org/arxiv-2408.00526","url":null,"abstract":"Landscape analysis aims to characterise optimisation problems based on their\u0000objective (or fitness) function landscape properties. The problem search space\u0000is typically sampled, and various landscape features are estimated based on the\u0000samples. One particularly salient set of features is information content, which\u0000requires the samples to be sequences of neighbouring solutions, such that the\u0000local relationships between consecutive sample points are preserved. Generating\u0000such spatially correlated samples that also provide good search space coverage\u0000is challenging. It is therefore common to first obtain an unordered sample with\u0000good search space coverage, and then apply an ordering algorithm such as the\u0000nearest neighbour to minimise the distance between consecutive points in the\u0000sample. However, the nearest neighbour algorithm becomes computationally\u0000prohibitive in higher dimensions, thus there is a need for more efficient\u0000alternatives. In this study, Hilbert space-filling curves are proposed as a\u0000method to efficiently obtain high-quality ordered samples. Hilbert curves are a\u0000special case of fractal curves, and guarantee uniform coverage of a bounded\u0000search space while providing a spatially correlated sample. We study the\u0000effectiveness of Hilbert curves as samplers, and discover that they are capable\u0000of extracting salient features at a fraction of the computational cost compared\u0000to Latin hypercube sampling with post-factum ordering. Further, we investigate\u0000the use of Hilbert curves as an ordering strategy, and find that they order the\u0000sample significantly faster than the nearest neighbour ordering, without\u0000sacrificing the saliency of the extracted features.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141886848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SHA-CNN: Scalable Hierarchical Aware Convolutional Neural Network for Edge AI
Narendra Singh Dhakad, Yuvnish Malhotra, Santosh Kumar Vishvakarma, Kaushik Roy (arXiv:2407.21370, 2024-07-31)
This paper introduces a Scalable Hierarchical Aware Convolutional Neural Network (SHA-CNN) model architecture for Edge AI applications. The proposed hierarchical CNN model is meticulously crafted to strike a balance between computational efficiency and accuracy, addressing the challenges posed by resource-constrained edge devices. SHA-CNN demonstrates its efficacy by achieving accuracy comparable to state-of-the-art hierarchical models while outperforming baseline models. The key innovation lies in the model's hierarchical awareness, enabling it to discern and prioritize relevant features at multiple levels of abstraction. The proposed architecture classifies data in a hierarchical manner, facilitating a nuanced understanding of complex features within the datasets. Moreover, SHA-CNN exhibits remarkable scalability, allowing the seamless incorporation of new classes. This flexibility is particularly advantageous in dynamic environments where the model must adapt to evolving datasets and accommodate additional classes without extensive retraining. The model was validated on the PYNQ Z2 FPGA board, achieving accuracies of 99.34%, 83.35%, and 63.66% on the MNIST, CIFAR-10, and CIFAR-100 datasets, respectively. For CIFAR-100, the proposed architecture performs hierarchical classification with 10% less computation while giving up only 0.7% accuracy relative to the state of the art. The adaptability of SHA-CNN to FPGA architectures underscores its potential for deployment in edge devices, where computational resources are limited. The SHA-CNN framework thus emerges as a promising advancement at the intersection of hierarchical CNNs, scalability, and FPGA-based Edge AI.
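The abstract does not detail the mechanism of hierarchical awareness, so the PyTorch sketch below shows one common coarse-to-fine pattern it is consistent with: a shared backbone feeds a coarse head (e.g. CIFAR-100's 20 superclasses) whose prediction conditions a fine head (100 classes), trained with a joint loss. All layer sizes and the conditioning scheme are illustrative assumptions, not SHA-CNN's actual design.

```python
import torch
import torch.nn as nn

class HierarchicalHead(nn.Module):
    """Coarse-to-fine classifier: predict a superclass, refine within it."""
    def __init__(self, feat_dim=128, n_coarse=20, n_fine=100):
        super().__init__()
        self.coarse = nn.Linear(feat_dim, n_coarse)
        self.fine = nn.Linear(feat_dim + n_coarse, n_fine)

    def forward(self, feats):
        c_logits = self.coarse(feats)
        # condition the fine head on the coarse prediction
        f_logits = self.fine(torch.cat([feats, c_logits.softmax(-1)], dim=-1))
        return c_logits, f_logits

backbone = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
    nn.Flatten(), nn.Linear(32 * 16, 128), nn.ReLU(),
)
head = HierarchicalHead()
x = torch.rand(8, 3, 32, 32)                  # CIFAR-100-sized inputs
c_logits, f_logits = head(backbone(x))
# joint loss over both levels of the label hierarchy (dummy labels here)
loss = nn.functional.cross_entropy(c_logits, torch.randint(0, 20, (8,))) \
     + nn.functional.cross_entropy(f_logits, torch.randint(0, 100, (8,)))
```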
{"title":"SHA-CNN: Scalable Hierarchical Aware Convolutional Neural Network for Edge AI","authors":"Narendra Singh Dhakad, Yuvnish Malhotra, Santosh Kumar Vishvakarma, Kaushik Roy","doi":"arxiv-2407.21370","DOIUrl":"https://doi.org/arxiv-2407.21370","url":null,"abstract":"This paper introduces a Scalable Hierarchical Aware Convolutional Neural\u0000Network (SHA-CNN) model architecture for Edge AI applications. The proposed\u0000hierarchical CNN model is meticulously crafted to strike a balance between\u0000computational efficiency and accuracy, addressing the challenges posed by\u0000resource-constrained edge devices. SHA-CNN demonstrates its efficacy by\u0000achieving accuracy comparable to state-of-the-art hierarchical models while\u0000outperforming baseline models in accuracy metrics. The key innovation lies in\u0000the model's hierarchical awareness, enabling it to discern and prioritize\u0000relevant features at multiple levels of abstraction. The proposed architecture\u0000classifies data in a hierarchical manner, facilitating a nuanced understanding\u0000of complex features within the datasets. Moreover, SHA-CNN exhibits a\u0000remarkable capacity for scalability, allowing for the seamless incorporation of\u0000new classes. This flexibility is particularly advantageous in dynamic\u0000environments where the model needs to adapt to evolving datasets and\u0000accommodate additional classes without the need for extensive retraining.\u0000Testing has been conducted on the PYNQ Z2 FPGA board to validate the proposed\u0000model. The results achieved an accuracy of 99.34%, 83.35%, and 63.66% for\u0000MNIST, CIFAR-10, and CIFAR-100 datasets, respectively. For CIFAR-100, our\u0000proposed architecture performs hierarchical classification with 10% reduced\u0000computation while compromising only 0.7% accuracy with the state-of-the-art.\u0000The adaptability of SHA-CNN to FPGA architecture underscores its potential for\u0000deployment in edge devices, where computational resources are limited. The\u0000SHA-CNN framework thus emerges as a promising advancement in the intersection\u0000of hierarchical CNNs, scalability, and FPGA-based Edge AI.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"241 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interactive embodied evolution for socially adept Artificial General Creatures
Kevin Godin-Dubois, Olivier Weissl, Karine Miras, Anna V. Kononova (arXiv:2407.21357, 2024-07-31)
We introduce here the concept of Artificial General Creatures (AGC), which encompasses "robotic or virtual agents with a wide enough range of capabilities to ensure their continued survival". With this in mind, we propose a research line aimed at incrementally building both the technology and the trustworthiness of AGC. The core element of this approach is that trust can only be built over time, through demonstrably mutually beneficial interactions. To this end, we advocate starting from unobtrusive, nonthreatening artificial agents that would explicitly collaborate with humans, much as domestic animals do. By combining multiple research fields, from Evolutionary Robotics to Neuroscience, from Ethics to Human-Machine Interaction, we aim to create embodied, self-sustaining Artificial General Creatures that would form social and emotional connections with humans. Although they would not be able to play competitive online games or generate poems, we argue that creatures akin to artificial pets would be invaluable stepping stones toward symbiotic Artificial General Intelligence.
{"title":"Interactive embodied evolution for socially adept Artificial General Creatures","authors":"Kevin Godin-Dubois, Olivier Weissl, Karine Miras, Anna V. Kononova","doi":"arxiv-2407.21357","DOIUrl":"https://doi.org/arxiv-2407.21357","url":null,"abstract":"We introduce here the concept of Artificial General Creatures (AGC) which\u0000encompasses \"robotic or virtual agents with a wide enough range of capabilities\u0000to ensure their continued survival\". With this in mind, we propose a research\u0000line aimed at incrementally building both the technology and the\u0000trustworthiness of AGC. The core element in this approach is that trust can\u0000only be built over time, through demonstrably mutually beneficial interactions. To this end, we advocate starting from unobtrusive, nonthreatening artificial\u0000agents that would explicitly collaborate with humans, similarly to what\u0000domestic animals do. By combining multiple research fields, from Evolutionary\u0000Robotics to Neuroscience, from Ethics to Human-Machine Interaction, we aim at\u0000creating embodied, self-sustaining Artificial General Creatures that would form\u0000social and emotional connections with humans. Although they would not be able\u0000to play competitive online games or generate poems, we argue that creatures\u0000akin to artificial pets would be invaluable stepping stones toward symbiotic\u0000Artificial General Intelligence.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lexicase-based Selection Methods with Down-sampling for Symbolic Regression Problems: Overview and Benchmark
Alina Geiger, Dominik Sobania, Franz Rothlauf (arXiv:2407.21632, 2024-07-31)
In recent years, several new lexicase-based selection variants have emerged due to the success of standard lexicase selection in various application domains. For symbolic regression problems, variants that use an epsilon threshold or batches of training cases, among others, have led to performance improvements. Lately, variants that combine lexicase selection with down-sampling strategies have received particular attention. This paper evaluates random as well as informed down-sampling in combination with the relevant lexicase-based selection methods on a wide range of symbolic regression problems. In contrast to most prior work, we compare the methods not only over a given evaluation budget but also over a given running time, since time is usually limited in practice. We find that for a given evaluation budget, epsilon-lexicase selection in combination with random or informed down-sampling outperforms all other methods. Only for a rather long running time of 24 hours does tournament selection in combination with informed down-sampling perform best. If the given running time is very short, lexicase variants using batches of training cases perform best.
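For concreteness, here is a minimal sketch of the best-performing combination under an evaluation budget: epsilon-lexicase selection over a random down-sample of the training cases. The per-case epsilon is commonly the median absolute deviation of the candidates' errors on that case; informed down-sampling (which picks informative, distinct cases rather than a uniform subset) is not shown, and the population/error setup is a toy.

```python
import random
import statistics

def median_absolute_deviation(values):
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)

def epsilon_lexicase(pop_errors, sample_frac=0.1):
    """Select one parent. pop_errors[i][j]: error of individual i on case j."""
    n_cases = len(pop_errors[0])
    # random down-sampling: evaluate on a uniform subset of training cases
    cases = random.sample(range(n_cases), max(1, int(sample_frac * n_cases)))
    random.shuffle(cases)
    candidates = list(range(len(pop_errors)))
    for j in cases:
        errs = [pop_errors[i][j] for i in candidates]
        eps = median_absolute_deviation(errs)   # per-case epsilon threshold
        best = min(errs)
        # keep only candidates within epsilon of the best on this case
        candidates = [i for i, e in zip(candidates, errs) if e <= best + eps]
        if len(candidates) == 1:
            break
    return random.choice(candidates)

# toy usage: 50 individuals, 100 training cases
pop = [[random.random() for _ in range(100)] for _ in range(50)]
parent = epsilon_lexicase(pop)
print("selected individual:", parent)
```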
{"title":"Lexicase-based Selection Methods with Down-sampling for Symbolic Regression Problems: Overview and Benchmark","authors":"Alina Geiger, Dominik Sobania, Franz Rothlauf","doi":"arxiv-2407.21632","DOIUrl":"https://doi.org/arxiv-2407.21632","url":null,"abstract":"In recent years, several new lexicase-based selection variants have emerged\u0000due to the success of standard lexicase selection in various application\u0000domains. For symbolic regression problems, variants that use an\u0000epsilon-threshold or batches of training cases, among others, have led to\u0000performance improvements. Lately, especially variants that combine lexicase\u0000selection and down-sampling strategies have received a lot of attention. This\u0000paper evaluates random as well as informed down-sampling in combination with\u0000the relevant lexicase-based selection methods on a wide range of symbolic\u0000regression problems. In contrast to most work, we not only compare the methods\u0000over a given evaluation budget, but also over a given time as time is usually\u0000limited in practice. We find that for a given evaluation budget,\u0000epsilon-lexicase selection in combination with random or informed down-sampling\u0000outperforms all other methods. Only for a rather long running time of 24h, the\u0000best performing method is tournament selection in combination with informed\u0000down-sampling. If the given running time is very short, lexicase variants using\u0000batches of training cases perform best.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"48 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neuromorphic on-chip reservoir computing with spiking neural network architectures
Samip Karki, Diego Chavez Arana, Andrew Sornborger, Francesco Caravelli (arXiv:2407.20547, 2024-07-30)
Reservoir computing is a promising approach for harnessing the computational power of recurrent neural networks while dramatically simplifying training. This paper investigates the application of integrate-and-fire neurons within reservoir computing frameworks for two distinct tasks: capturing the chaotic dynamics of the Hénon map and forecasting the Mackey-Glass time series. Integrate-and-fire neurons can be implemented in low-power neuromorphic architectures such as Intel Loihi. We explore the impact of network topologies created through random interactions on the reservoir's performance. Our study reveals task-specific variations in network effectiveness, highlighting the importance of tailored architectures for distinct computational tasks. To identify optimal network configurations, we employ a meta-learning approach combined with simulated annealing. This method efficiently explores the space of possible network structures, identifying architectures that excel in different scenarios. The resulting networks demonstrate a range of behaviors, showcasing how inherent architectural features influence task-specific capabilities. We study reservoir computing performance using a custom integrate-and-fire implementation, Intel's Lava neuromorphic computing software framework, and an on-chip implementation on Loihi. We conclude with an analysis of the energy performance of the Loihi architecture.
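The Lava/Loihi implementations and the meta-learning topology search cannot be reproduced here, but the core pipeline the paper builds on is short: a fixed random reservoir of leaky integrate-and-fire neurons, low-pass filtered spike traces as the reservoir state, and a ridge-regression readout that is the only trained component. The numpy sketch below applies it to the Mackey-Glass task; all constants (reservoir size, sparsity, leak, filter) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mackey_glass(n, tau=17, dt=1.0):
    # Euler-integrated Mackey-Glass series (standard chaotic benchmark)
    x = np.full(n + tau, 1.2)
    for t in range(tau, n + tau - 1):
        x[t + 1] = x[t] + dt * (0.2 * x[t - tau] / (1.0 + x[t - tau] ** 10)
                                - 0.1 * x[t])
    return x[tau:]

series = mackey_glass(2000)

# Fixed random sparse reservoir of leaky integrate-and-fire neurons
N = 300
W = rng.normal(0.0, 0.5, (N, N)) * (rng.random((N, N)) < 0.1)
w_in = rng.normal(0.0, 1.0, N)
leak, thresh = 0.9, 1.0
v, spikes, rate = np.zeros(N), np.zeros(N), np.zeros(N)
states = []
for u in series:
    v = leak * v + w_in * u + W @ spikes   # membrane integration
    spikes = (v > thresh).astype(float)
    v = np.where(spikes > 0, 0.0, v)       # reset membrane on spike
    rate = 0.95 * rate + 0.05 * spikes     # low-pass spike trace as state
    states.append(rate.copy())
X = np.asarray(states)

# Ridge-regression readout trained to predict the next value; only the
# readout is trained -- the reservoir weights stay fixed.
washout = 200
A, y = X[washout:-1], series[washout + 1:]
w_out = np.linalg.solve(A.T @ A + 1e-4 * np.eye(N), A.T @ y)
pred = A @ w_out
print("train NRMSE:", np.sqrt(np.mean((pred - y) ** 2)) / y.std())
```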
{"title":"Neuromorphic on-chip reservoir computing with spiking neural network architectures","authors":"Samip Karki, Diego Chavez Arana, Andrew Sornborger, Francesco Caravelli","doi":"arxiv-2407.20547","DOIUrl":"https://doi.org/arxiv-2407.20547","url":null,"abstract":"Reservoir computing is a promising approach for harnessing the computational\u0000power of recurrent neural networks while dramatically simplifying training.\u0000This paper investigates the application of integrate-and-fire neurons within\u0000reservoir computing frameworks for two distinct tasks: capturing chaotic\u0000dynamics of the H'enon map and forecasting the Mackey-Glass time series.\u0000Integrate-and-fire neurons can be implemented in low-power neuromorphic\u0000architectures such as Intel Loihi. We explore the impact of network topologies\u0000created through random interactions on the reservoir's performance. Our study\u0000reveals task-specific variations in network effectiveness, highlighting the\u0000importance of tailored architectures for distinct computational tasks. To\u0000identify optimal network configurations, we employ a meta-learning approach\u0000combined with simulated annealing. This method efficiently explores the space\u0000of possible network structures, identifying architectures that excel in\u0000different scenarios. The resulting networks demonstrate a range of behaviors,\u0000showcasing how inherent architectural features influence task-specific\u0000capabilities. We study the reservoir computing performance using a custom\u0000integrate-and-fire code, Intel's Lava neuromorphic computing software\u0000framework, and via an on-chip implementation in Loihi. We conclude with an\u0000analysis of the energy performance of the Loihi architecture.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"75 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}