Khanh Doan, Long Tung Vuong, Tuan Nguyen, Anh Tuan Bui, Quyen Tran, Thanh-Toan Do, Dinh Phung, Trung Le
Diffusion models (DMs) have become fundamental components of generative modeling, excelling across domains such as image creation, audio generation, and complex data interpolation. Signal-to-Noise (S2N) diffusion models constitute a diverse family covering most state-of-the-art diffusion models. While there have been several attempts to study S2N diffusion models from various perspectives, a comprehensive study that connects the different viewpoints and explores new ones is still needed. In this study, we offer a comprehensive perspective on noise schedulers, examining their role through the lens of the signal-to-noise ratio (SNR) and its connections to information theory. Building upon this framework, we develop a generalized backward equation to enhance the performance of the inference process.
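As a concrete illustration of the SNR lens (a generic sketch, not this paper's formulation): under a variance-preserving forward process x_t = sqrt(ab_t) * x_0 + sqrt(1 - ab_t) * eps, a noise scheduler fixes ab_t, and the signal-to-noise ratio is ab_t / (1 - ab_t). The cosine schedule below is the standard one from the diffusion literature; function names are illustrative.

```python
import math

def cosine_alpha_bar(t, s=0.008):
    """Cumulative signal level of the standard cosine noise schedule;
    t runs from 0 (clean data) to 1 (pure noise)."""
    f = math.cos((t + s) / (1 + s) * math.pi / 2) ** 2
    f0 = math.cos(s / (1 + s) * math.pi / 2) ** 2
    return f / f0

def snr(t):
    """Signal-to-noise ratio for the variance-preserving process
    x_t = sqrt(ab) * x_0 + sqrt(1 - ab) * eps, namely ab / (1 - ab)."""
    ab = cosine_alpha_bar(t)
    return ab / (1.0 - ab)

# A noise scheduler is characterized by how SNR(t) decays toward 0.
print(snr(0.1) > snr(0.5) > snr(0.9))  # True
```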
{"title":"Connective Viewpoints of Signal-to-Noise Diffusion Models","authors":"Khanh Doan, Long Tung Vuong, Tuan Nguyen, Anh Tuan Bui, Quyen Tran, Thanh-Toan Do, Dinh Phung, Trung Le","doi":"arxiv-2408.04221","DOIUrl":"https://doi.org/arxiv-2408.04221","url":null,"abstract":"Diffusion models (DM) have become fundamental components of generative\u0000models, excelling across various domains such as image creation, audio\u0000generation, and complex data interpolation. Signal-to-Noise diffusion models\u0000constitute a diverse family covering most state-of-the-art diffusion models.\u0000While there have been several attempts to study Signal-to-Noise (S2N) diffusion\u0000models from various perspectives, there remains a need for a comprehensive\u0000study connecting different viewpoints and exploring new perspectives. In this\u0000study, we offer a comprehensive perspective on noise schedulers, examining\u0000their role through the lens of the signal-to-noise ratio (SNR) and its\u0000connections to information theory. Building upon this framework, we have\u0000developed a generalized backward equation to enhance the performance of the\u0000inference process.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The field of multiobjective evolutionary algorithms (MOEAs) is popular for optimization problems with conflicting objectives. However, it is still theoretically unknown how MOEAs perform for different degrees of conflict, even for no conflict, compared with typical approaches outside this field. As a first step toward this question, we propose the OneMaxMin$_k$ benchmark class with degree of conflict $k \in [0..n]$, a generalized variant of COCZ and OneMinMax. Two typical non-MOEA approaches are considered: scalarization (the weighted-sum approach) and the $\epsilon$-constraint approach. We prove that for any set of weights, the set of optima found by the scalarization approach cannot cover the full Pareto front. Although the set of optima of the constrained problems constructed via the $\epsilon$-constraint approach can cover the full Pareto front, the commonly used ways (via exterior or nonparameter penalty functions) to solve such constrained problems encounter difficulties. The nonparameter penalty function approach cannot construct the set of optima whose function values form the Pareto front, and the exterior approach helps (with expected runtime $O(n \ln n)$ for randomized local search to reach any Pareto front point), but only with careful settings of $\epsilon$ and $r$ ($r > 1/(\epsilon + 1 - \lceil \epsilon \rceil)$). In contrast, the commonly analyzed MOEAs can efficiently solve OneMaxMin$_k$ without the above careful designs. We prove that (G)SEMO, MOEA/D, NSGA-II, and SMS-EMOA can cover the full Pareto front in $O(\max\{k,1\} n \ln n)$ expected function evaluations, the same asymptotic runtime as the exterior approach in the $\epsilon$-constraint method with careful settings. As a side result, our results also give a performance analysis of solving a constrained problem via a multiobjective approach.
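To make the two non-MOEA reductions concrete, here is a minimal sketch on the classic OneMinMax problem (maximize both the number of ones and the number of zeros of a bit string); OneMaxMin$_k$ itself is defined in the paper and is not reproduced here.

```python
def one_min_max(x):
    """Bi-objective OneMinMax: maximize both the number of ones (f1)
    and the number of zeros (f2) of a bit string."""
    ones = sum(x)
    return ones, len(x) - ones

def weighted_sum(x, w):
    """Scalarization: collapse both objectives into a single value."""
    f1, f2 = one_min_max(x)
    return w * f1 + (1 - w) * f2

def eps_constraint_value(x, eps):
    """Epsilon-constraint: maximize f1 subject to f2 >= eps."""
    f1, f2 = one_min_max(x)
    return f1 if f2 >= eps else float("-inf")

n = 4
points = [[int(b) for b in format(i, f"0{n}b")] for i in range(2 ** n)]

# A fixed weight w != 1/2 is optimized only at an extreme of the front ...
best = max(points, key=lambda x: weighted_sum(x, 0.7))
print(sum(best))  # 4: the all-ones string

# ... while sweeping eps reaches every trade-off value of f1.
f1_values = sorted({max(eps_constraint_value(x, e) for x in points) for e in range(n + 1)})
print(f1_values)  # [0, 1, 2, 3, 4]
```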
{"title":"Theoretical Advantage of Multiobjective Evolutionary Algorithms for Problems with Different Degrees of Conflict","authors":"Weijie Zheng","doi":"arxiv-2408.04207","DOIUrl":"https://doi.org/arxiv-2408.04207","url":null,"abstract":"The field of multiobjective evolutionary algorithms (MOEAs) often emphasizes\u0000its popularity for optimization problems with conflicting objectives. However,\u0000it is still theoretically unknown how MOEAs perform for different degrees of\u0000conflict, even for no conflicts, compared with typical approaches outside this\u0000field. As the first step to tackle this question, we propose the OneMaxMin$_k$\u0000benchmark class with the degree of the conflict $kin[0..n]$, a generalized\u0000variant of COCZ and OneMinMax. Two typical non-MOEA approaches, scalarization\u0000(weighted-sum approach) and $epsilon$-constraint approach, are considered. We\u0000prove that for any set of weights, the set of optima found by scalarization\u0000approach cannot cover the full Pareto front. Although the set of the optima of\u0000constrained problems constructed via $epsilon$-constraint approach can cover\u0000the full Pareto front, the general used ways (via exterior or nonparameter\u0000penalty functions) to solve such constrained problems encountered difficulties.\u0000The nonparameter penalty function way cannot construct the set of optima whose\u0000function values are the Pareto front, and the exterior way helps (with expected\u0000runtime of $O(nln n)$ for the randomized local search algorithm for reaching\u0000any Pareto front point) but with careful settings of $epsilon$ and $r$\u0000($r>1/(epsilon+1-lceil epsilon rceil)$). In constrast, the generally analyzed MOEAs can efficiently solve\u0000OneMaxMin$_k$ without above careful designs. 
We prove that (G)SEMO, MOEA/D,\u0000NSGA-II, and SMS-EMOA can cover the full Pareto front in $O(max{k,1}nln n)$\u0000expected number of function evaluations, which is the same asymptotic runtime\u0000as the exterior way in $epsilon$-constraint approach with careful settings. As\u0000a side result, our results also give the performance analysis of solving a\u0000constrained problem via multiobjective way.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-objective evolutionary algorithms (MOEAs) have emerged as powerful tools for solving complex optimization problems characterized by multiple, often conflicting, objectives. While advancements have been made in computational efficiency as well as diversity and convergence of solutions, a critical challenge persists: the internal evolutionary mechanisms are opaque to human users. Drawing upon the successes of explainable AI in explaining complex algorithms and models, we argue that the need to understand the underlying evolutionary operators and population dynamics within MOEAs aligns well with a visual analytics paradigm. This paper introduces ParetoTracker, a visual analytics framework designed to support the comprehension and inspection of population dynamics in the evolutionary processes of MOEAs. Informed by a preliminary literature review and expert interviews, the framework establishes a multi-level analysis scheme that caters to user engagement and exploration ranging from examining overall trends in performance metrics to conducting fine-grained inspections of evolutionary operations. In contrast to conventional practices that require manual plotting of solutions for each generation, ParetoTracker facilitates the examination of temporal trends and dynamics across consecutive generations in an integrated visual interface. The effectiveness of the framework is demonstrated through case studies and expert interviews focused on widely adopted benchmark optimization problems.
{"title":"ParetoTracker: Understanding Population Dynamics in Multi-objective Evolutionary Algorithms through Visual Analytics","authors":"Zherui Zhang, Fan Yang, Ran Cheng, Yuxin Ma","doi":"arxiv-2408.04539","DOIUrl":"https://doi.org/arxiv-2408.04539","url":null,"abstract":"Multi-objective evolutionary algorithms (MOEAs) have emerged as powerful\u0000tools for solving complex optimization problems characterized by multiple,\u0000often conflicting, objectives. While advancements have been made in\u0000computational efficiency as well as diversity and convergence of solutions, a\u0000critical challenge persists: the internal evolutionary mechanisms are opaque to\u0000human users. Drawing upon the successes of explainable AI in explaining complex\u0000algorithms and models, we argue that the need to understand the underlying\u0000evolutionary operators and population dynamics within MOEAs aligns well with a\u0000visual analytics paradigm. This paper introduces ParetoTracker, a visual\u0000analytics framework designed to support the comprehension and inspection of\u0000population dynamics in the evolutionary processes of MOEAs. Informed by\u0000preliminary literature review and expert interviews, the framework establishes\u0000a multi-level analysis scheme, which caters to user engagement and exploration\u0000ranging from examining overall trends in performance metrics to conducting\u0000fine-grained inspections of evolutionary operations. In contrast to\u0000conventional practices that require manual plotting of solutions for each\u0000generation, ParetoTracker facilitates the examination of temporal trends and\u0000dynamics across consecutive generations in an integrated visual interface. 
The\u0000effectiveness of the framework is demonstrated through case studies and expert\u0000interviews focused on widely adopted benchmark optimization problems.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"43 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alessandro Pierro, Philipp Stratmann, Gabriel Andres Fonseca Guerra, Sumedh Risbud, Timothy Shea, Ashish Rao Mangalore, Andreas Wild
In this article, we describe an algorithm for solving Quadratic Unconstrained Binary Optimization (QUBO) problems on the Intel Loihi 2 neuromorphic processor. The solver is based on a hardware-aware, fine-grained parallel simulated annealing algorithm developed for Intel's neuromorphic research chip Loihi 2. Preliminary results show that our approach can generate feasible solutions in as little as 1 ms and is up to 37x more energy-efficient than two baseline solvers running on a CPU. These advantages could be especially relevant for size-, weight-, and power-constrained edge computing applications.
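The hardware-mapped solver itself is specific to Loihi 2; the sketch below shows only the generic single-flip simulated-annealing core for a QUBO energy x^T Q x, with illustrative parameters.

```python
import math
import random

def qubo_energy(x, Q):
    """Energy of binary vector x under QUBO matrix Q: x^T Q x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def simulated_annealing(Q, steps=2000, t0=2.0, cooling=0.999, seed=0):
    """Single-flip simulated annealing; returns the best state visited."""
    rng = random.Random(seed)
    n = len(Q)
    x = [rng.randint(0, 1) for _ in range(n)]
    e = qubo_energy(x, Q)
    best_x, best_e = list(x), e
    t = t0
    for _ in range(steps):
        i = rng.randrange(n)
        x[i] ^= 1                                      # propose a single-bit flip
        e_new = qubo_energy(x, Q)
        if e_new <= e or rng.random() < math.exp((e - e_new) / t):
            e = e_new                                  # accept the move
            if e < best_e:
                best_x, best_e = list(x), e
        else:
            x[i] ^= 1                                  # reject: undo the flip
        t *= cooling                                   # geometric cooling
    return best_x, best_e

# A 2-variable QUBO whose minimum energy is -1 (exactly one bit set).
Q = [[-1, 2], [2, -1]]
x, e = simulated_annealing(Q)
print(e)  # -1
```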
{"title":"Solving QUBO on the Loihi 2 Neuromorphic Processor","authors":"Alessandro Pierro, Philipp Stratmann, Gabriel Andres Fonseca Guerra, Sumedh Risbud, Timothy Shea, Ashish Rao Mangalore, Andreas Wild","doi":"arxiv-2408.03076","DOIUrl":"https://doi.org/arxiv-2408.03076","url":null,"abstract":"In this article, we describe an algorithm for solving Quadratic Unconstrained\u0000Binary Optimization problems on the Intel Loihi 2 neuromorphic processor. The\u0000solver is based on a hardware-aware fine-grained parallel simulated annealing\u0000algorithm developed for Intel's neuromorphic research chip Loihi 2. Preliminary\u0000results show that our approach can generate feasible solutions in as little as\u00001 ms and up to 37x more energy efficient compared to two baseline solvers\u0000running on a CPU. These advantages could be especially relevant for size-,\u0000weight-, and power-constrained edge computing applications.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dylan Adams, Magda Zajaczkowska, Ashiq Anjum, Andrea Soltoggio, Shirin Dora
Despite basic differences between Spiking Neural Networks (SNNs) and Artificial Neural Networks (ANNs), most research on SNNs involves adapting ANN-based methods for SNNs. Pruning (dropping connections) and quantization (reducing precision) are often used to improve the energy efficiency of SNNs. These methods are very effective for ANNs, whose energy needs are determined by signals transmitted on synapses. However, the event-driven paradigm in SNNs implies that energy is consumed by spikes. In this paper, we propose a new synapse model whose weights are modulated by Interspike Intervals (ISIs), i.e., the time difference between two spikes. SNNs composed of this synapse model, termed ISI-Modulated SNNs (IMSNNs), can use gradient descent to estimate how the ISI of a neuron changes after updating its synaptic parameters. A higher ISI implies fewer spikes and vice versa. The learning algorithm for IMSNNs exploits this information to selectively propagate gradients such that learning is achieved by increasing the ISIs, resulting in a network that generates fewer spikes. The performance of IMSNNs with dense and convolutional layers has been evaluated in terms of classification accuracy and the number of spikes using the MNIST and FashionMNIST datasets. The performance comparison with conventional SNNs shows that IMSNNs exhibit up to a 90% reduction in the number of spikes while maintaining similar classification accuracy.
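A minimal sketch of the ISI idea follows; the paper's exact modulation function and gradient machinery are not reproduced, and `isi_modulated_weight` is an illustrative stand-in with an assumed decay form.

```python
import math

def interspike_intervals(spike_times):
    """ISIs: time differences between consecutive spikes of one neuron."""
    return [t1 - t0 for t0, t1 in zip(spike_times, spike_times[1:])]

def isi_modulated_weight(base_weight, isi, tau=10.0):
    """Illustrative stand-in for ISI modulation: the effective synaptic
    weight decays with the ISI, so neurons that spike rarely (high ISI)
    contribute less (assumed form, not the paper's)."""
    return base_weight * math.exp(-isi / tau)

spikes = [2.0, 5.0, 9.0, 20.0]
isis = interspike_intervals(spikes)
print(isis)  # [3.0, 4.0, 11.0]
```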
{"title":"Synaptic Modulation using Interspike Intervals Increases Energy Efficiency of Spiking Neural Networks","authors":"Dylan Adams, Magda Zajaczkowska, Ashiq Anjum, Andrea Soltoggio, Shirin Dora","doi":"arxiv-2408.02961","DOIUrl":"https://doi.org/arxiv-2408.02961","url":null,"abstract":"Despite basic differences between Spiking Neural Networks (SNN) and\u0000Artificial Neural Networks (ANN), most research on SNNs involve adapting\u0000ANN-based methods for SNNs. Pruning (dropping connections) and quantization\u0000(reducing precision) are often used to improve energy efficiency of SNNs. These\u0000methods are very effective for ANNs whose energy needs are determined by\u0000signals transmitted on synapses. However, the event-driven paradigm in SNNs\u0000implies that energy is consumed by spikes. In this paper, we propose a new\u0000synapse model whose weights are modulated by Interspike Intervals (ISI) i.e.\u0000time difference between two spikes. SNNs composed of this synapse model, termed\u0000ISI Modulated SNNs (IMSNN), can use gradient descent to estimate how the ISI of\u0000a neuron changes after updating its synaptic parameters. A higher ISI implies\u0000fewer spikes and vice-versa. The learning algorithm for IMSNNs exploits this\u0000information to selectively propagate gradients such that learning is achieved\u0000by increasing the ISIs resulting in a network that generates fewer spikes. The\u0000performance of IMSNNs with dense and convolutional layers have been evaluated\u0000in terms of classification accuracy and the number of spikes using the MNIST\u0000and FashionMNIST datasets. 
The performance comparison with conventional SNNs\u0000shows that IMSNNs exhibit upto 90% reduction in the number of spikes while\u0000maintaining similar classification accuracy.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"137 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rachmad Vidya Wicaksana Putra, Muhammad Abdullah Hanif, Muhammad Shafique
Convolutional Neural Networks (CNNs), a prominent type of Deep Neural Networks (DNNs), have emerged as a state-of-the-art solution for machine learning tasks. To improve the performance and energy efficiency of CNN inference, the employment of specialized hardware accelerators is prevalent. However, CNN accelerators still face performance and energy-efficiency challenges due to high off-chip memory (DRAM) access latency and energy, which are especially crucial for latency- and energy-constrained embedded applications. Moreover, different DRAM architectures have different profiles of access latency and energy, making it challenging to optimize them for high-performance and energy-efficient CNN accelerators. To address this, we present PENDRAM, a novel design space exploration methodology that enables high-performance and energy-efficient CNN acceleration through a generalized DRAM data mapping policy. Specifically, it explores the impact of different DRAM data mapping policies and DRAM architectures, across different CNN partitioning and scheduling schemes, on DRAM access latency and energy, then identifies the Pareto-optimal design choices. The experimental results show that our DRAM data mapping policy improves the energy-delay product of DRAM accesses in the CNN accelerator over other mapping policies by up to 96%. In this manner, our PENDRAM methodology offers high-performance and energy-efficient CNN acceleration under any given DRAM architecture for diverse embedded AI applications.
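The exploration's final step, identifying Pareto-optimal design choices among (latency, energy) profiles, can be sketched generically; the design points below are hypothetical, not measurements from the paper.

```python
def pareto_front(points):
    """Keep the (latency, energy) design points that no other point
    dominates: q dominates p if q is no worse in both metrics and
    strictly better in at least one (here: q != p with q <= p)."""
    front = []
    for p in points:
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return front

# Hypothetical (latency, energy) profiles of candidate DRAM mapping policies.
designs = [(10, 5), (8, 9), (12, 4), (9, 6), (11, 8)]
print(sorted(pareto_front(designs)))  # [(8, 9), (9, 6), (10, 5), (12, 4)]

# The reported metric, energy-delay product, is just latency * energy.
print(min(lat * en for lat, en in designs))  # 48
```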
{"title":"PENDRAM: Enabling High-Performance and Energy-Efficient Processing of Deep Neural Networks through a Generalized DRAM Data Mapping Policy","authors":"Rachmad Vidya Wicaksana Putra, Muhammad Abdullah Hanif, Muhammad Shafique","doi":"arxiv-2408.02412","DOIUrl":"https://doi.org/arxiv-2408.02412","url":null,"abstract":"Convolutional Neural Networks (CNNs), a prominent type of Deep Neural\u0000Networks (DNNs), have emerged as a state-of-the-art solution for solving\u0000machine learning tasks. To improve the performance and energy efficiency of CNN\u0000inference, the employment of specialized hardware accelerators is prevalent.\u0000However, CNN accelerators still face performance- and energy-efficiency\u0000challenges due to high off-chip memory (DRAM) access latency and energy, which\u0000are especially crucial for latency- and energy-constrained embedded\u0000applications. Moreover, different DRAM architectures have different profiles of\u0000access latency and energy, thus making it challenging to optimize them for high\u0000performance and energy-efficient CNN accelerators. To address this, we present\u0000PENDRAM, a novel design space exploration methodology that enables\u0000high-performance and energy-efficient CNN acceleration through a generalized\u0000DRAM data mapping policy. Specifically, it explores the impact of different\u0000DRAM data mapping policies and DRAM architectures across different CNN\u0000partitioning and scheduling schemes on the DRAM access latency and energy, then\u0000identifies the pareto-optimal design choices. The experimental results show\u0000that our DRAM data mapping policy improves the energy-delay-product of DRAM\u0000accesses in the CNN accelerator over other mapping policies by up to 96%. 
In\u0000this manner, our PENDRAM methodology offers high-performance and\u0000energy-efficient CNN acceleration under any given DRAM architectures for\u0000diverse embedded AI applications.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"84 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andoni I. Garmendia, Quentin Cappart, Josu Ceberio, Alexander Mendiburu
Neural Combinatorial Optimization (NCO) is an emerging domain in which deep learning techniques are employed as standalone solvers for combinatorial optimization problems. Despite their potential, existing NCO methods often suffer from inefficient search space exploration, frequently leading to entrapment in local optima or redundant exploration of previously visited states. This paper introduces a versatile framework, Memory-Augmented Reinforcement for Combinatorial Optimization (MARCO), that can be used to enhance both constructive and improvement methods in NCO through an innovative memory module. MARCO stores data collected throughout the optimization trajectory and retrieves contextually relevant information at each state. This way, the search is guided by two competing criteria: making the best decision in terms of solution quality and avoiding revisiting already explored solutions. This approach promotes a more efficient use of the available optimization budget. Moreover, thanks to the parallel nature of NCO models, several search threads can run simultaneously, all sharing the same memory module, enabling efficient collaborative exploration. Empirical evaluations, carried out on the maximum cut, maximum independent set, and travelling salesman problems, reveal that the memory module effectively increases exploration, enabling the model to discover diverse, higher-quality solutions. MARCO achieves good performance at a low computational cost, establishing a promising new direction in the field of NCO.
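The revisit-avoidance idea can be sketched with a minimal shared memory that penalizes the score of already-seen solutions; the interface and penalty form below are assumptions for illustration, not MARCO's actual implementation.

```python
class VisitMemory:
    """Minimal shared memory of visited solutions. Scoring balances the two
    competing criteria: solution quality vs. a penalty for revisits."""

    def __init__(self, penalty=0.5):
        self.seen = set()
        self.penalty = penalty

    def record(self, solution):
        self.seen.add(tuple(solution))

    def score(self, solution, quality):
        revisit = self.penalty if tuple(solution) in self.seen else 0.0
        return quality - revisit

# Several search threads could share one instance of this memory.
memory = VisitMemory(penalty=0.5)
memory.record([1, 0, 1])
print(memory.score([1, 0, 1], quality=2.0))  # 1.5 (penalized revisit)
print(memory.score([0, 1, 1], quality=2.0))  # 2.0 (novel state)
```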
{"title":"MARCO: A Memory-Augmented Reinforcement Framework for Combinatorial Optimization","authors":"Andoni I. Garmendia, Quentin Cappart, Josu Ceberio, Alexander Mendiburu","doi":"arxiv-2408.02207","DOIUrl":"https://doi.org/arxiv-2408.02207","url":null,"abstract":"Neural Combinatorial Optimization (NCO) is an emerging domain where deep\u0000learning techniques are employed to address combinatorial optimization problems\u0000as a standalone solver. Despite their potential, existing NCO methods often\u0000suffer from inefficient search space exploration, frequently leading to local\u0000optima entrapment or redundant exploration of previously visited states. This\u0000paper introduces a versatile framework, referred to as Memory-Augmented\u0000Reinforcement for Combinatorial Optimization (MARCO), that can be used to\u0000enhance both constructive and improvement methods in NCO through an innovative\u0000memory module. MARCO stores data collected throughout the optimization\u0000trajectory and retrieves contextually relevant information at each state. This\u0000way, the search is guided by two competing criteria: making the best decision\u0000in terms of the quality of the solution and avoiding revisiting already\u0000explored solutions. This approach promotes a more efficient use of the\u0000available optimization budget. Moreover, thanks to the parallel nature of NCO\u0000models, several search threads can run simultaneously, all sharing the same\u0000memory module, enabling an efficient collaborative exploration. Empirical\u0000evaluations, carried out on the maximum cut, maximum independent set and\u0000travelling salesman problems, reveal that the memory module effectively\u0000increases the exploration, enabling the model to discover diverse,\u0000higher-quality solutions. 
MARCO achieves good performance in a low\u0000computational cost, establishing a promising new direction in the field of NCO.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"51 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leonardo Lucio Custode, Fabio Caraffini, Anil Yaman, Giovanni Iacca
Hyperparameter optimization is a crucial problem in Evolutionary Computation. In fact, the values of the hyperparameters directly impact the trajectory taken by the optimization process, and their choice requires extensive reasoning by human operators. Although a variety of self-adaptive Evolutionary Algorithms have been proposed in the literature, no definitive solution has been found. In this work, we perform a preliminary investigation to automate the reasoning process that leads to the choice of hyperparameter values. We employ two open-source Large Language Models (LLMs), namely Llama2-70b and Mixtral, to analyze the optimization logs online and provide novel real-time hyperparameter recommendations. We study our approach in the context of step-size adaptation for (1+1)-ES. The results suggest that LLMs can be an effective method for optimizing hyperparameters in Evolution Strategies, encouraging further research in this direction.
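For context, the conventional self-adaptive baseline in this setting is step-size adaptation for the (1+1)-ES, classically done with the one-fifth success rule; the sketch below shows that baseline (the LLM-in-the-loop procedure is not reproduced, and parameter values are illustrative).

```python
import math
import random

def one_plus_one_es(f, x0, sigma=1.0, iters=200, seed=0):
    """(1+1)-ES minimizing f, with one-fifth success rule step-size control:
    widen sigma after a success, narrow it after a failure, with update
    magnitudes in a 1:4 ratio so sigma is stable at a 1/5 success rate."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    for _ in range(iters):
        y = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        fy = f(y)
        if fy <= fx:
            x, fx = y, fy
            sigma *= math.exp(0.2)     # success: widen the search
        else:
            sigma *= math.exp(-0.05)   # failure: narrow it
    return x, fx, sigma

sphere = lambda v: sum(vi * vi for vi in v)
x, fx, sigma = one_plus_one_es(sphere, [5.0, -3.0])
print(fx <= sphere([5.0, -3.0]))  # True
```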
{"title":"An investigation on the use of Large Language Models for hyperparameter tuning in Evolutionary Algorithms","authors":"Leonardo Lucio Custode, Fabio Caraffini, Anil Yaman, Giovanni Iacca","doi":"arxiv-2408.02451","DOIUrl":"https://doi.org/arxiv-2408.02451","url":null,"abstract":"Hyperparameter optimization is a crucial problem in Evolutionary Computation.\u0000In fact, the values of the hyperparameters directly impact the trajectory taken\u0000by the optimization process, and their choice requires extensive reasoning by\u0000human operators. Although a variety of self-adaptive Evolutionary Algorithms\u0000have been proposed in the literature, no definitive solution has been found. In\u0000this work, we perform a preliminary investigation to automate the reasoning\u0000process that leads to the choice of hyperparameter values. We employ two\u0000open-source Large Language Models (LLMs), namely Llama2-70b and Mixtral, to\u0000analyze the optimization logs online and provide novel real-time hyperparameter\u0000recommendations. We study our approach in the context of step-size adaptation\u0000for (1+1)-ES. The results suggest that LLMs can be an effective method for\u0000optimizing hyperparameters in Evolution Strategies, encouraging further\u0000research in this direction.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"47 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Guo-Yun Lin, Zong-Gan Chen, Yuncheng Jiang, Zhi-Hui Zhan, Jun Zhang
How to simultaneously locate multiple global peaks and achieve a certain accuracy on the found peaks are two key challenges in solving multimodal optimization problems (MMOPs). In this paper, a landscape-aware differential evolution (LADE) algorithm is proposed for MMOPs, which utilizes landscape knowledge to maintain sufficient diversity and provide efficient search guidance. In detail, the landscape knowledge is utilized in the following three aspects. First, a landscape-aware peak exploration helps each individual evolve adaptively to locate a peak, and simulates the regions of the found peaks according to the search history so that individuals avoid relocating an already-found peak. Second, a landscape-aware peak distinction determines whether an individual has located a new global peak, a new local peak, or an already-found peak. Accuracy refinement can thus be conducted only on the global peaks, enhancing search efficiency. Third, a landscape-aware reinitialization specifies the initial position of an individual adaptively according to the distribution of the found peaks, which helps explore more peaks. Experiments are conducted on 20 widely used benchmark MMOPs. Experimental results show that LADE obtains generally better or competitive performance compared with seven well-performing algorithms proposed recently and four winner algorithms of the IEEE CEC competitions for multimodal optimization.
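LADE builds on differential evolution; the sketch below shows only the standard DE/rand/1/bin operator that such algorithms extend (the landscape-aware components are the paper's contribution and are not reproduced).

```python
import random

def de_rand_1_bin(pop, i, F=0.5, CR=0.9, rng=None):
    """Standard DE/rand/1/bin: perturb a random base vector with the scaled
    difference of two other random individuals, then binomially cross the
    mutant with the target vector pop[i]."""
    rng = rng or random.Random()
    r1, r2, r3 = rng.sample([j for j in range(len(pop)) if j != i], 3)
    target = pop[i]
    dim = len(target)
    mutant = [pop[r1][d] + F * (pop[r2][d] - pop[r3][d]) for d in range(dim)]
    jrand = rng.randrange(dim)  # guarantees at least one mutant gene survives
    return [mutant[d] if d == jrand or rng.random() < CR else target[d]
            for d in range(dim)]

rng = random.Random(1)
pop = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(6)]
trial = de_rand_1_bin(pop, 0, rng=rng)
print(len(trial))  # 3
```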
{"title":"A Landscape-Aware Differential Evolution for Multimodal Optimization Problems","authors":"Guo-Yun Lin, Zong-Gan Chen, Yuncheng Jiang, Zhi-Hui Zhan, Jun Zhang","doi":"arxiv-2408.02340","DOIUrl":"https://doi.org/arxiv-2408.02340","url":null,"abstract":"How to simultaneously locate multiple global peaks and achieve certain accuracy on the found peaks are two key challenges in solving multimodal optimization problems (MMOPs). In this paper, a landscape-aware differential evolution (LADE) algorithm is proposed for MMOPs, which utilizes landscape knowledge to maintain sufficient diversity and provide efficient search guidance. In detail, the landscape knowledge is efficiently utilized in the following three aspects. First, a landscape-aware peak exploration helps each individual evolve adaptively to locate a peak and simulates the regions of the found peaks according to search history to avoid an individual locating a found peak. Second, a landscape-aware peak distinction distinguishes whether an individual locates a new global peak, a new local peak, or a found peak. Accuracy refinement can thus only be conducted on the global peaks to enhance the search efficiency. Third, a landscape-aware reinitialization specifies the initial position of an individual adaptively according to the distribution of the found peaks, which helps explore more peaks. The experiments are conducted on 20 widely-used benchmark MMOPs. Experimental results show that LADE obtains generally better or competitive performance compared with seven well-performed algorithms proposed recently and four winner algorithms in the IEEE CEC competitions for multimodal optimization.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We show how brain networks, modeled as Spiking Neural Networks, can be viewed at different levels of abstraction. Lower levels include complications such as failures of neurons and edges. Higher levels are more abstract, making simplifying assumptions to avoid these complications. We show precise relationships between executions of networks at different levels, which enables us to understand the behavior of lower-level networks in terms of the behavior of higher-level networks. We express our results using two abstract networks, A1 and A2, one to express firing guarantees and the other to express non-firing guarantees, and one detailed network D. The abstract networks contain reliable neurons and edges, whereas the detailed network has neurons and edges that may fail, subject to some constraints. Here we consider just initial stopping failures. To define these networks, we begin with abstract network A1 and modify it systematically to obtain the other two networks. To obtain A2, we simply lower the firing thresholds of the neurons. To obtain D, we introduce failures of neurons and edges, and incorporate redundancy in the neurons and edges in order to compensate for the failures. We also define corresponding inputs for the networks, and corresponding executions of the networks. We prove two main theorems, one relating corresponding executions of A1 and D and the other relating corresponding executions of A2 and D. Together, these give both firing and non-firing guarantees for the detailed network D. We also give a third theorem, relating the effects of D on an external reliable actuator neuron to the effects of the abstract networks on the same actuator neuron.
{"title":"Abstraction in Neural Networks","authors":"Nancy Lynch","doi":"arxiv-2408.02125","DOIUrl":"https://doi.org/arxiv-2408.02125","url":null,"abstract":"We show how brain networks, modeled as Spiking Neural Networks, can be viewed at different levels of abstraction. Lower levels include complications such as failures of neurons and edges. Higher levels are more abstract, making simplifying assumptions to avoid these complications. We show precise relationships between executions of networks at different levels, which enables us to understand the behavior of lower-level networks in terms of the behavior of higher-level networks. We express our results using two abstract networks, A1 and A2, one to express firing guarantees and the other to express non-firing guarantees, and one detailed network D. The abstract networks contain reliable neurons and edges, whereas the detailed network has neurons and edges that may fail, subject to some constraints. Here we consider just initial stopping failures. To define these networks, we begin with abstract network A1 and modify it systematically to obtain the other two networks. To obtain A2, we simply lower the firing thresholds of the neurons. To obtain D, we introduce failures of neurons and edges, and incorporate redundancy in the neurons and edges in order to compensate for the failures. We also define corresponding inputs for the networks, and corresponding executions of the networks. We prove two main theorems, one relating corresponding executions of A1 and D and the other relating corresponding executions of A2 and D. Together, these give both firing and non-firing guarantees for the detailed network D. We also give a third theorem, relating the effects of D on an external reliable actuator neuron to the effects of the abstract networks on the same actuator neuron.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
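The abstraction relation described in this abstract can be illustrated with a toy threshold-neuron model. This is a hedged sketch built only from the abstract, not the paper's actual constructions: `fires` plays the role of a reliable neuron in the abstract networks A1/A2, while `redundant_fires` mimics the detailed network D, where each of several redundant replicas may suffer an initial stopping failure. The function names, replication factor, and failure probability are all assumptions for illustration.

```python
import random

def fires(active_inputs, weights, threshold):
    """Reliable threshold neuron (A1/A2-style): fires iff the summed weight
    of the active inputs meets the firing threshold."""
    return sum(w for x, w in zip(active_inputs, weights) if x) >= threshold

def redundant_fires(active_inputs, weights, threshold,
                    copies=3, fail_p=0.2, seed=0):
    """Detailed-network sketch (D-style): `copies` redundant replicas of the
    neuron, each of which may suffer an initial stopping failure (it never
    fires) with probability `fail_p`. The redundant unit fires if any
    surviving replica fires, so the whole unit stays silent incorrectly
    only with probability fail_p ** copies."""
    rng = random.Random(seed)
    alive = [rng.random() >= fail_p for _ in range(copies)]
    return any(a and fires(active_inputs, weights, threshold) for a in alive)
```

Note the non-firing guarantee comes for free in this toy model: a replica that fails by stopping can only stay silent, so redundancy can mask failures of firing without ever causing a spurious spike — consistent with the abstract's separation of firing guarantees (via A1) from non-firing guarantees (via A2).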