The majority of theoretical analyses of evolutionary algorithms in the discrete domain focus on binary optimization algorithms, even though black-box optimization on the categorical domain has a lot of practical applications. In this paper, we consider a probabilistic model-based algorithm using the family of categorical distributions as its underlying distribution and set the sample size as two. We term this specific algorithm the categorical compact genetic algorithm (ccGA). The ccGA can be considered as an extension of the compact genetic algorithm (cGA), which is an efficient binary optimization algorithm. We theoretically analyze the dependency of the number of possible categories K, the number of dimensions D, and the learning rate η on the runtime. We investigate the tail bound of the runtime on two typical linear functions on the categorical domain: categorical OneMax (COM) and KVAL. We derive that the runtimes on COM and KVAL are O(Dln(DK)/η) and Θ(DlnK/η) with high probability, respectively. Our analysis is a generalization for that of the cGA on the binary domain.
离散领域进化算法的理论分析大多集中在二元优化算法上,尽管分类领域的黑盒优化有很多实际应用。在本文中,我们考虑一种基于概率模型的算法,将分类分布族作为其基础分布,并将样本量设为两个。我们将这种特定算法称为分类紧凑遗传算法(ccGA)。ccGA 可以看作是紧凑遗传算法(ccGA)的扩展,后者是一种高效的二进制优化算法。我们从理论上分析了可能的类别数 K、维数 D 和学习率 η 对运行时间的影响。我们研究了分类域中两个典型线性函数的运行时间尾界:分类 OneMax (COM) 和 KVAL。我们得出,COM 和 KVAL 的运行时间分别为 O(Dln(DK)/η) 和 Θ(DlnK/η),且概率很高。我们的分析是对二元域 cGA 分析的推广。
{"title":"Tail Bounds on the Runtime of Categorical Compact Genetic Algorithm.","authors":"Ryoki Hamano, Kento Uchida, Shinichi Shirakawa, Daiki Morinaga, Youhei Akimoto","doi":"10.1162/evco_a_00361","DOIUrl":"https://doi.org/10.1162/evco_a_00361","url":null,"abstract":"<p><p>The majority of theoretical analyses of evolutionary algorithms in the discrete domain focus on binary optimization algorithms, even though black-box optimization on the categorical domain has a lot of practical applications. In this paper, we consider a probabilistic model-based algorithm using the family of categorical distributions as its underlying distribution and set the sample size as two. We term this specific algorithm the categorical compact genetic algorithm (ccGA). The ccGA can be considered as an extension of the compact genetic algorithm (cGA), which is an efficient binary optimization algorithm. We theoretically analyze the dependency of the number of possible categories K, the number of dimensions D, and the learning rate η on the runtime. We investigate the tail bound of the runtime on two typical linear functions on the categorical domain: categorical OneMax (COM) and KVAL. We derive that the runtimes on COM and KVAL are O(Dln(DK)/η) and Θ(DlnK/η) with high probability, respectively. Our analysis is a generalization for that of the cGA on the binary domain.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"1-52"},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142367288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many real-world optimization problems can be stated in terms of submodular functions. Furthermore, these real-world problems often involve uncertainties which may lead to the violation of given constraints. A lot of evolutionary multi-objective algorithms following the Pareto optimization approach have recently been analyzed and applied to submodular problems with different types of constraints. We present a first runtime analysis of evolutionary multi-objective algorithms based on Pareto optimization for chance-constrained submodular functions. Here the constraint involves stochastic components and the constraint can only be violated with a small probability of α. We investigate the classical GSEMO algorithm for two different bi-objective formulations using tail bounds to determine the feasibility of solutions. We show that the algorithm GSEMO obtains the same worst case performance guarantees for monotone submodular functions as recently analyzed greedy algorithms for the case of uniform IID weights and uniformly distributed weights with the same dispersion when using the appropriate bi-objective formulation. As part of our investigations, we also point out situations where the use of tail bounds in the first bi-objective formulation can prevent GSEMO from obtaining good solutions in the case of uniformly distributed weights with the same dispersion if the objective function is submodular but non-monotone due to a single element impacting monotonicity. Furthermore, we investigate the behavior of the evolutionary multi-objective algorithms GSEMO, NSGA-II and SPEA2 on different submodular chance-constrained network problems. Our experimental results show that the use of evolutionary multi-objective algorithms leads to significant performance improvements compared to state-of-the-art greedy algorithms for submodular optimization.
{"title":"Optimizing Monotone Chance-Constrained Submodular Functions Using Evolutionary Multi-Objective Algorithms.","authors":"Aneta Neumann, Frank Neumann","doi":"10.1162/evco_a_00360","DOIUrl":"https://doi.org/10.1162/evco_a_00360","url":null,"abstract":"<p><p>Many real-world optimization problems can be stated in terms of submodular functions. Furthermore, these real-world problems often involve uncertainties which may lead to the violation of given constraints. A lot of evolutionary multi-objective algorithms following the Pareto optimization approach have recently been analyzed and applied to submodular problems with different types of constraints. We present a first runtime analysis of evolutionary multi-objective algorithms based on Pareto optimization for chance-constrained submodular functions. Here the constraint involves stochastic components and the constraint can only be violated with a small probability of α. We investigate the classical GSEMO algorithm for two different bi-objective formulations using tail bounds to determine the feasibility of solutions. We show that the algorithm GSEMO obtains the same worst case performance guarantees for monotone submodular functions as recently analyzed greedy algorithms for the case of uniform IID weights and uniformly distributed weights with the same dispersion when using the appropriate bi-objective formulation. As part of our investigations, we also point out situations where the use of tail bounds in the first bi-objective formulation can prevent GSEMO from obtaining good solutions in the case of uniformly distributed weights with the same dispersion if the objective function is submodular but non-monotone due to a single element impacting monotonicity. Furthermore, we investigate the behavior of the evolutionary multi-objective algorithms GSEMO, NSGA-II and SPEA2 on different submodular chance-constrained network problems. Our experimental results show that the use of evolutionary multi-objective algorithms leads to significant performance improvements compared to state-of-the-art greedy algorithms for submodular optimization.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"1-35"},"PeriodicalIF":4.6,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142331627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performing classification on high-dimensional data poses a significant challenge due to the huge search space. Moreover, complex feature interactions introduce an additional obstacle. The problems can be addressed by using feature selection to select relevant features or feature construction to construct a small set of high-level features. However, performing feature selection or feature construction only might make the feature set suboptimal. To remedy this problem, this study investigates the use of genetic programming for simultaneous feature selection and feature construction in addressing different classification tasks. The proposed approach is tested on 16 datasets and compared with seven methods including both feature selection and feature constructions techniques. The results show that the obtained feature sets with the constructed and/or selected features can significantly increase the classification accuracy and reduce the dimensionality of the datasets. Further analysis reveals the complementarity of the obtained features leading to the promising classification performance of the proposed method.
{"title":"Genetic Programming for Automatically Evolving Multiple Features to Classification.","authors":"Peng Wang, Bing Xue, Jing Liang, Mengjie Zhang","doi":"10.1162/evco_a_00359","DOIUrl":"https://doi.org/10.1162/evco_a_00359","url":null,"abstract":"<p><p>Performing classification on high-dimensional data poses a significant challenge due to the huge search space. Moreover, complex feature interactions introduce an additional obstacle. The problems can be addressed by using feature selection to select relevant features or feature construction to construct a small set of high-level features. However, performing feature selection or feature construction only might make the feature set suboptimal. To remedy this problem, this study investigates the use of genetic programming for simultaneous feature selection and feature construction in addressing different classification tasks. The proposed approach is tested on 16 datasets and compared with seven methods including both feature selection and feature constructions techniques. The results show that the obtained feature sets with the constructed and/or selected features can significantly increase the classification accuracy and reduce the dimensionality of the datasets. Further analysis reveals the complementarity of the obtained features leading to the promising classification performance of the proposed method.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"1-27"},"PeriodicalIF":4.6,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142299919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Giuseppe Paolo, Miranda Coninx, Alban Laflaquière, Stephane Doncieux
Learning optimal policies in sparse rewards settings is difficult as the learning agent has little to no feedback on the quality of its actions. In these situations, a good strategy is to focus on exploration, hopefully leading to the discovery of a reward signal to improve on. A learning algorithm capable of dealing with this kind of setting has to be able to (1) explore possible agent behaviors and (2) exploit any possible discovered reward. Exploration algorithms have been proposed that require the definition of a low-dimension behavior space, in which the behavior generated by the agent's policy can be represented. The need to design a priori this space such that it is worth exploring is a major limitation of these algorithms. In this work, we introduce STAX, an algorithm designed to learn a behavior space on-the-fly and to explore it while optimizing any reward discovered (see Figure 1). It does so by separating the exploration and learning of the behavior space from the exploitation of the reward through an alternating two-step process. In the first step, STAX builds a repertoire of diverse policies while learning a low-dimensional representation of the high-dimensional observations generated during the policies evaluation. In the exploitation step, emitters optimize the performance of the discovered rewarding solutions. Experiments conducted on three different sparse reward environments show that STAX performs comparably to existing baselines while requiring much less prior information about the task as it autonomously builds the behavior space it explores.
{"title":"Discovering and Exploiting Sparse Rewards in a Learned Behavior Space.","authors":"Giuseppe Paolo, Miranda Coninx, Alban Laflaquière, Stephane Doncieux","doi":"10.1162/evco_a_00343","DOIUrl":"10.1162/evco_a_00343","url":null,"abstract":"<p><p>Learning optimal policies in sparse rewards settings is difficult as the learning agent has little to no feedback on the quality of its actions. In these situations, a good strategy is to focus on exploration, hopefully leading to the discovery of a reward signal to improve on. A learning algorithm capable of dealing with this kind of setting has to be able to (1) explore possible agent behaviors and (2) exploit any possible discovered reward. Exploration algorithms have been proposed that require the definition of a low-dimension behavior space, in which the behavior generated by the agent's policy can be represented. The need to design a priori this space such that it is worth exploring is a major limitation of these algorithms. In this work, we introduce STAX, an algorithm designed to learn a behavior space on-the-fly and to explore it while optimizing any reward discovered (see Figure 1). It does so by separating the exploration and learning of the behavior space from the exploitation of the reward through an alternating two-step process. In the first step, STAX builds a repertoire of diverse policies while learning a low-dimensional representation of the high-dimensional observations generated during the policies evaluation. In the exploitation step, emitters optimize the performance of the discovered rewarding solutions. Experiments conducted on three different sparse reward environments show that STAX performs comparably to existing baselines while requiring much less prior information about the task as it autonomously builds the behavior space it explores.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"275-305"},"PeriodicalIF":4.6,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41171496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Novelty search is a powerful tool for finding diverse sets of objects in complicated spaces. Recent experiments on simplified versions of novelty search introduce the idea that novelty search happens at the level of the archive space, rather than individual points. The sparseness measure and archive update criterion create a process that is driven by a two measures: (1) spread out to cover the space while trying to remain as efficiently packed as possible, and (2) metrics inspired by k nearest neighbor theory. In this paper, we generalize previous simplifications of novelty search to include traditional population (μ,λ) dynamics for generating new search points, where the population and the archive are updated separately. We provide some theoretical guidance regarding balancing mutation and sparseness criteria and introduce the concept of saturation as a way of talking about fully covered spaces. We show empirically that claims that novelty search is inherently objectiveless are incorrect. We leverage the understanding of novelty search as an optimizer of archive coverage, suggest several ways to improve the search, and demonstrate one simple improvement-generating some new points directly from the archive rather than the parent population.
新奇搜索是在复杂空间中寻找不同对象集的有力工具。最近对简化版新颖性搜索的实验提出了一个想法,即新颖性搜索发生在档案空间的层面上,而不是单个点上。稀疏度衡量标准和档案更新标准创建了一个由两种衡量标准驱动的过程:(1)在尽量保持有效包装的同时,向外扩散以覆盖空间;(2)受 k 近邻理论启发的度量。在本文中,我们对以往的新颖性搜索简化进行了概括,纳入了用于生成新搜索点的传统种群(μ,λ)动力学,其中种群和档案分别更新。我们为平衡突变和稀疏性标准提供了一些理论指导,并引入了饱和概念,以此来讨论完全覆盖的空间。我们通过经验证明,认为新颖性搜索本质上是不客观的说法是不正确的。我们将新颖性搜索理解为档案覆盖率的优化器,提出了几种改进搜索的方法,并演示了一种简单的改进方法--直接从档案而不是父群体中生成一些新点。
{"title":"Preliminary Analysis of Simple Novelty Search.","authors":"R Paul Wiegand","doi":"10.1162/evco_a_00340","DOIUrl":"10.1162/evco_a_00340","url":null,"abstract":"<p><p>Novelty search is a powerful tool for finding diverse sets of objects in complicated spaces. Recent experiments on simplified versions of novelty search introduce the idea that novelty search happens at the level of the archive space, rather than individual points. The sparseness measure and archive update criterion create a process that is driven by a two measures: (1) spread out to cover the space while trying to remain as efficiently packed as possible, and (2) metrics inspired by k nearest neighbor theory. In this paper, we generalize previous simplifications of novelty search to include traditional population (μ,λ) dynamics for generating new search points, where the population and the archive are updated separately. We provide some theoretical guidance regarding balancing mutation and sparseness criteria and introduce the concept of saturation as a way of talking about fully covered spaces. We show empirically that claims that novelty search is inherently objectiveless are incorrect. We leverage the understanding of novelty search as an optimizer of archive coverage, suggest several ways to improve the search, and demonstrate one simple improvement-generating some new points directly from the archive rather than the parent population.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"249-273"},"PeriodicalIF":4.6,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9828886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Minimizing the number of selected features and maximizing the classification performance are two main objectives in feature selection, which can be formulated as a bi-objective optimization problem. Due to the complex interactions between features, a solution (i.e., feature subset) with poor objective values does not mean that all the features it selects are useless, as some of them combined with other complementary features can greatly improve the classification performance. Thus, it is necessary to consider not only the performance of feature subsets in the objective space, but also their differences in the search space, to explore more promising feature combinations. To this end, this paper proposes a tri-objective method for bi-objective feature selection in classification, which solves a bi-objective feature selection problem as a tri-objective problem by considering the diversity (differences) between feature subsets in the search space as the third objective. The selection based on the converted tri-objective method can maintain a balance between minimizing the number of selected features, maximizing the classification performance, and exploring more promising feature subsets. Furthermore, a novel initialization strategy and an offspring reproduction operator are proposed to promote the diversity of feature subsets in the objective space and improve the search ability, respectively. The proposed algorithm is compared with five multiobjective-based feature selection methods, six typical feature selection methods, and two peer methods with diversity as a helper objective. Experimental results on 20 real-world classification datasets suggest that the proposed method outperforms the compared methods in most scenarios.
{"title":"A Tri-Objective Method for Bi-Objective Feature Selection in Classification.","authors":"Ruwang Jiao, Bing Xue, Mengjie Zhang","doi":"10.1162/evco_a_00339","DOIUrl":"10.1162/evco_a_00339","url":null,"abstract":"<p><p>Minimizing the number of selected features and maximizing the classification performance are two main objectives in feature selection, which can be formulated as a bi-objective optimization problem. Due to the complex interactions between features, a solution (i.e., feature subset) with poor objective values does not mean that all the features it selects are useless, as some of them combined with other complementary features can greatly improve the classification performance. Thus, it is necessary to consider not only the performance of feature subsets in the objective space, but also their differences in the search space, to explore more promising feature combinations. To this end, this paper proposes a tri-objective method for bi-objective feature selection in classification, which solves a bi-objective feature selection problem as a tri-objective problem by considering the diversity (differences) between feature subsets in the search space as the third objective. The selection based on the converted tri-objective method can maintain a balance between minimizing the number of selected features, maximizing the classification performance, and exploring more promising feature subsets. Furthermore, a novel initialization strategy and an offspring reproduction operator are proposed to promote the diversity of feature subsets in the objective space and improve the search ability, respectively. The proposed algorithm is compared with five multiobjective-based feature selection methods, six typical feature selection methods, and two peer methods with diversity as a helper objective. Experimental results on 20 real-world classification datasets suggest that the proposed method outperforms the compared methods in most scenarios.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"217-248"},"PeriodicalIF":4.6,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9822009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jacob de Nobel, Furong Ye, Diederick Vermetten, Hao Wang, Carola Doerr, Thomas Bäck
We present IOHexperimenter, the experimentation module of the IOHprofiler project. IOHexperimenter aims at providing an easy-to-use and customizable toolbox for benchmarking iterative optimization heuristics such as local search, evolutionary and genetic algorithms, and Bayesian optimization techniques. IOHexperimenter can be used as a stand-alone tool or as part of a benchmarking pipeline that uses other modules of the IOHprofiler environment. IOHexperimenter provides an efficient interface between optimization problems and their solvers while allowing for granular logging of the optimization process. Its logs are fully compatible with existing tools for interactive data analysis, which significantly speeds up the deployment of a benchmarking pipeline. The main components of IOHexperimenter are the environment to build customized problem suites and the various logging options that allow users to steer the granularity of the data records.
{"title":"IOHexperimenter: Benchmarking Platform for Iterative Optimization Heuristics.","authors":"Jacob de Nobel, Furong Ye, Diederick Vermetten, Hao Wang, Carola Doerr, Thomas Bäck","doi":"10.1162/evco_a_00342","DOIUrl":"10.1162/evco_a_00342","url":null,"abstract":"<p><p>We present IOHexperimenter, the experimentation module of the IOHprofiler project. IOHexperimenter aims at providing an easy-to-use and customizable toolbox for benchmarking iterative optimization heuristics such as local search, evolutionary and genetic algorithms, and Bayesian optimization techniques. IOHexperimenter can be used as a stand-alone tool or as part of a benchmarking pipeline that uses other modules of the IOHprofiler environment. IOHexperimenter provides an efficient interface between optimization problems and their solvers while allowing for granular logging of the optimization process. Its logs are fully compatible with existing tools for interactive data analysis, which significantly speeds up the deployment of a benchmarking pipeline. The main components of IOHexperimenter are the environment to build customized problem suites and the various logging options that allow users to steer the granularity of the data records.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"205-210"},"PeriodicalIF":4.6,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9862561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The herein proposed Python package pflacco provides a set of numerical features to characterize single-objective continuous and constrained optimization problems. Thereby, pflacco addresses two major challenges in the area of optimization. Firstly, it provides the means to develop an understanding of a given problem instance, which is crucial for designing, selecting, or configuring optimization algorithms in general. Secondly, these numerical features can be utilized in the research streams of automated algorithm selection and configuration. While the majority of these landscape features are already available in the R package flacco, our Python implementation offers these tools to an even wider audience and thereby promotes research interests and novel avenues in the area of optimization.
{"title":"Pflacco: Feature-Based Landscape Analysis of Continuous and Constrained Optimization Problems in Python.","authors":"Raphael Patrick Prager, Heike Trautmann","doi":"10.1162/evco_a_00341","DOIUrl":"10.1162/evco_a_00341","url":null,"abstract":"<p><p>The herein proposed Python package pflacco provides a set of numerical features to characterize single-objective continuous and constrained optimization problems. Thereby, pflacco addresses two major challenges in the area of optimization. Firstly, it provides the means to develop an understanding of a given problem instance, which is crucial for designing, selecting, or configuring optimization algorithms in general. Secondly, these numerical features can be utilized in the research streams of automated algorithm selection and configuration. While the majority of these landscape features are already available in the R package flacco, our Python implementation offers these tools to an even wider audience and thereby promotes research interests and novel avenues in the area of optimization.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"211-216"},"PeriodicalIF":4.6,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9867698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In classification, feature selection is an essential pre-processing step that selects a small subset of features to improve classification performance. Existing feature selection approaches can be divided into three main approaches: wrapper approaches, filter approaches, and embedded approaches. In comparison with two other approaches, embedded approaches usually have better trade-off between classification performance and computation time. One of the most well-known embedded approaches is sparsity regularisation-based feature selection which generates sparse solutions for feature selection. Despite its good performance, sparsity regularisation-based feature selection outputs only a feature ranking which requires the number of selected features to be predefined. More importantly, the ranking mechanism introduces a risk of ignoring feature interactions which leads to the fact that many top-ranked but redundant features are selected. This work addresses the above problems by proposing a new representation that considers the interactions between features and can automatically determine an appropriate number of selected features. The proposed representation is used in a differential evolutionary (DE) algorithm to optimise the feature subset. In addition, a novel initialisation mechanism is proposed to let DE consider various numbers of selected features at the beginning. The proposed algorithm is examined on both synthetic and real-world datasets. The results on the synthetic dataset show that the proposed algorithm can select complementary features while existing sparsity regularisation-based feature selection algorithms are at risk of selecting redundant features. The results on real-world datasets show that the proposed algorithm achieves better classification performance than well-known wrapper, filter, and embedded approaches. The algorithm is also as efficient as filter feature selection approaches.
{"title":"Evolutionary Sparsity Regularisation-based Feature Selection for Binary Classification.","authors":"Bach Hoai Nguyen, Bing Xue, Mengjie Zhang","doi":"10.1162/evco_a_00358","DOIUrl":"https://doi.org/10.1162/evco_a_00358","url":null,"abstract":"<p><p>In classification, feature selection is an essential pre-processing step that selects a small subset of features to improve classification performance. Existing feature selection approaches can be divided into three main approaches: wrapper approaches, filter approaches, and embedded approaches. In comparison with two other approaches, embedded approaches usually have better trade-off between classification performance and computation time. One of the most well-known embedded approaches is sparsity regularisation-based feature selection which generates sparse solutions for feature selection. Despite its good performance, sparsity regularisation-based feature selection outputs only a feature ranking which requires the number of selected features to be predefined. More importantly, the ranking mechanism introduces a risk of ignoring feature interactions which leads to the fact that many top-ranked but redundant features are selected. This work addresses the above problems by proposing a new representation that considers the interactions between features and can automatically determine an appropriate number of selected features. The proposed representation is used in a differential evolutionary (DE) algorithm to optimise the feature subset. In addition, a novel initialisation mechanism is proposed to let DE consider various numbers of selected features at the beginning. The proposed algorithm is examined on both synthetic and real-world datasets. The results on the synthetic dataset show that the proposed algorithm can select complementary features while existing sparsity regularisation-based feature selection algorithms are at risk of selecting redundant features. The results on real-world datasets show that the proposed algorithm achieves better classification performance than well-known wrapper, filter, and embedded approaches. The algorithm is also as efficient as filter feature selection approaches.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"1-33"},"PeriodicalIF":4.6,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142019413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zbyněk Pitra, Jan Koza, Jiří Tumpach, Martin Holeňa
Surrogate modeling has become a valuable technique for black-box optimization tasks with expensive evaluation of the objective function. In this paper, we investigate the relationships between the predictive accuracy of surrogate models, their settings, and features of the black-box function landscape during evolutionary optimization by the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) state-of-the-art optimizer for expensive continuous black-box tasks. This study aims to establish the foundation for specific rules and automated methods for selecting and tuning surrogate models by exploring relationships between landscape features and model errors, focusing on the behavior of a specific model within each generation in contrast to selecting a specific algorithm at the outset. We perform a feature analysis process, identifying a significant number of non-robust features and clustering similar landscape features, resulting in the selection of 14 features out of 384, varying with input data selection methods. Our analysis explores the error dependencies of four models across 39 settings, utilizing three methods for input data selection, drawn from surrogate-assisted CMA-ES runs on noiseless benchmarks within the Comparing Continuous Optimizers framework.
{"title":"Landscape Analysis for Surrogate Models in the Evolutionary Black-Box Context.","authors":"Zbyněk Pitra, Jan Koza, Jiří Tumpach, Martin Holeňa","doi":"10.1162/evco_a_00357","DOIUrl":"https://doi.org/10.1162/evco_a_00357","url":null,"abstract":"<p><p>Surrogate modeling has become a valuable technique for black-box optimization tasks with expensive evaluation of the objective function. In this paper, we investigate the relationships between the predictive accuracy of surrogate models, their settings, and features of the black-box function landscape during evolutionary optimization by the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) state-of-the-art optimizer for expensive continuous black-box tasks. This study aims to establish the foundation for specific rules and automated methods for selecting and tuning surrogate models by exploring relationships between landscape features and model errors, focusing on the behavior of a specific model within each generation in contrast to selecting a specific algorithm at the outset. We perform a feature analysis process, identifying a significant number of non-robust features and clustering similar landscape features, resulting in the selection of 14 features out of 384, varying with input data selection methods. Our analysis explores the error dependencies of four models across 39 settings, utilizing three methods for input data selection, drawn from surrogate-assisted CMA-ES runs on noiseless benchmarks within the Comparing Continuous Optimizers framework.</p>","PeriodicalId":50470,"journal":{"name":"Evolutionary Computation","volume":" ","pages":"1-29"},"PeriodicalIF":4.6,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141983781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}