
Latest articles in Evolutionary Computation

Genetic Programming for Automatically Evolving Multiple Features to Classification.
IF 4.6 | CAS Tier 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-19 | DOI: 10.1162/evco_a_00359
Peng Wang, Bing Xue, Jing Liang, Mengjie Zhang

Performing classification on high-dimensional data poses a significant challenge due to the huge search space. Moreover, complex feature interactions introduce an additional obstacle. These problems can be addressed by using feature selection to select relevant features or feature construction to construct a small set of high-level features. However, performing feature selection or feature construction alone might make the feature set suboptimal. To remedy this problem, this study investigates the use of genetic programming for simultaneous feature selection and feature construction in addressing different classification tasks. The proposed approach is tested on 16 datasets and compared with seven methods, including both feature selection and feature construction techniques. The results show that the obtained feature sets with the constructed and/or selected features can significantly increase the classification accuracy and reduce the dimensionality of the datasets. Further analysis reveals the complementarity of the obtained features, which leads to the promising classification performance of the proposed method.
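In GP-based feature construction of the kind described above, an evolved expression tree yields one new high-level feature, and the leaves it touches double as the selected subset of original features. A minimal stdlib-only sketch of that decoding (our illustration, not the authors' operator set):

```python
# Hypothetical illustration: a GP individual is a nested tuple of arithmetic
# primitives over original features; evaluating it produces one constructed
# feature, and the terminals it uses form the "selected" feature subset.

def evaluate(tree, sample):
    """Recursively evaluate an expression tree on one sample (list of floats)."""
    if isinstance(tree, int):          # terminal: index of an original feature
        return sample[tree]
    op, left, right = tree
    a, b = evaluate(left, sample), evaluate(right, sample)
    if op == "+": return a + b
    if op == "-": return a - b
    if op == "*": return a * b
    return a / b if abs(b) > 1e-12 else 1.0   # protected division

def used_features(tree):
    """The subset of original features this tree selects."""
    if isinstance(tree, int):
        return {tree}
    _, left, right = tree
    return used_features(left) | used_features(right)

# One candidate constructed feature: f_new = (x0 * x3) - x1
tree = ("-", ("*", 0, 3), 1)
sample = [2.0, 0.5, 9.0, 1.5]
print(evaluate(tree, sample))        # value of the constructed feature
print(sorted(used_features(tree)))   # the implicitly selected features
```

A real system would evolve many such trees and score them by classification accuracy; the point here is only how one representation can perform selection and construction simultaneously.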

Citations: 0
Discovering and Exploiting Sparse Rewards in a Learned Behavior Space.
IF 4.6 | CAS Tier 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-03 | DOI: 10.1162/evco_a_00343
Giuseppe Paolo, Miranda Coninx, Alban Laflaquière, Stephane Doncieux

Learning optimal policies in sparse rewards settings is difficult as the learning agent has little to no feedback on the quality of its actions. In these situations, a good strategy is to focus on exploration, hopefully leading to the discovery of a reward signal to improve on. A learning algorithm capable of dealing with this kind of setting has to be able to (1) explore possible agent behaviors and (2) exploit any possible discovered reward. Exploration algorithms have been proposed that require the definition of a low-dimensional behavior space, in which the behavior generated by the agent's policy can be represented. The need to design a priori this space such that it is worth exploring is a major limitation of these algorithms. In this work, we introduce STAX, an algorithm designed to learn a behavior space on-the-fly and to explore it while optimizing any reward discovered (see Figure 1). It does so by separating the exploration and learning of the behavior space from the exploitation of the reward through an alternating two-step process. In the first step, STAX builds a repertoire of diverse policies while learning a low-dimensional representation of the high-dimensional observations generated during policy evaluation. In the exploitation step, emitters optimize the performance of the discovered rewarding solutions. Experiments conducted on three different sparse reward environments show that STAX performs comparably to existing baselines while requiring much less prior information about the task as it autonomously builds the behavior space it explores.
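The alternating explore/exploit idea can be caricatured in a toy stdlib-only skeleton (our simplification, not the STAX implementation): policies are scalars, an identity descriptor stands in for the learned behavior space, and reward is sparse, nonzero only in a small region.

```python
import random

# Toy skeleton (assumptions, not the paper's code): exploration mutates
# archived policies and keeps novel ones; once any reward is discovered,
# an "emitter"-style step hill-climbs it with small mutations.

random.seed(0)

def behavior(policy):
    return policy                      # stand-in for a learned low-dim descriptor

def reward(policy):                    # sparse: nonzero only on [3, 4]
    return 4 - abs(policy - 3.5) if 3 <= policy <= 4 else 0.0

def novelty(b, archive, k=3):
    """Mean distance to the k nearest behaviors already in the archive."""
    dists = sorted(abs(b - a) for a in archive)
    return sum(dists[:k]) / min(k, len(dists))

archive = [0.0]
best = (0.0, 0.0)                      # (reward, policy)
for _ in range(300):
    if best[0] > 0:                    # exploitation: refine the rewarding solution
        cand = best[1] + random.gauss(0, 0.1)
    else:                              # exploration: mutate a random archived policy
        cand = random.choice(archive) + random.gauss(0, 1.0)
    if novelty(behavior(cand), archive) > 0.5:
        archive.append(cand)           # keep only sufficiently novel behaviors
    best = max(best, (reward(cand), cand))

print(len(archive), round(best[0], 2))
```

The real algorithm additionally learns the descriptor (an autoencoder over observations) and runs populations rather than single candidates; the skeleton only shows how the two phases interleave.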

Citations: 0
Preliminary Analysis of Simple Novelty Search.
IF 4.6 | CAS Tier 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-03 | DOI: 10.1162/evco_a_00340
R Paul Wiegand

Novelty search is a powerful tool for finding diverse sets of objects in complicated spaces. Recent experiments on simplified versions of novelty search introduce the idea that novelty search happens at the level of the archive space, rather than individual points. The sparseness measure and archive update criterion create a process that is driven by two measures: (1) spreading out to cover the space while trying to remain as efficiently packed as possible, and (2) metrics inspired by k-nearest-neighbor theory. In this paper, we generalize previous simplifications of novelty search to include traditional population (μ,λ) dynamics for generating new search points, where the population and the archive are updated separately. We provide some theoretical guidance regarding balancing mutation and sparseness criteria and introduce the concept of saturation as a way of talking about fully covered spaces. We show empirically that claims that novelty search is inherently objectiveless are incorrect. We leverage the understanding of novelty search as an optimizer of archive coverage, suggest several ways to improve the search, and demonstrate one simple improvement: generating some new points directly from the archive rather than the parent population.
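The two ingredients named above, the k-nearest-neighbor sparseness measure and the threshold-based archive update, can be sketched in a few lines (an illustrative toy under our own threshold choice, not the paper's code):

```python
import math

# Sparseness of a point = mean Euclidean distance to its k nearest archive
# members; a point enters the archive only if it is sparse (novel) enough.

def sparseness(point, archive, k=3):
    dists = sorted(math.dist(point, a) for a in archive)
    k = min(k, len(dists))
    return sum(dists[:k]) / k

def maybe_add(point, archive, threshold=1.0, k=3):
    """Archive update criterion: admit only sufficiently novel points."""
    if not archive or sparseness(point, archive, k) > threshold:
        archive.append(point)
        return True
    return False

archive = []
for p in [(0, 0), (0.2, 0.1), (3, 0), (3, 4)]:
    maybe_add(p, archive)
print(archive)   # the near-duplicate (0.2, 0.1) is rejected
```

Run repeatedly over mutated candidates, this update spreads the archive across the space, which is exactly the archive-coverage view of novelty search the paper analyzes.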

Citations: 0
A Tri-Objective Method for Bi-Objective Feature Selection in Classification.
IF 4.6 | CAS Tier 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-03 | DOI: 10.1162/evco_a_00339
Ruwang Jiao, Bing Xue, Mengjie Zhang

Minimizing the number of selected features and maximizing the classification performance are two main objectives in feature selection, which can be formulated as a bi-objective optimization problem. Due to the complex interactions between features, a solution (i.e., feature subset) with poor objective values does not mean that all the features it selects are useless, as some of them combined with other complementary features can greatly improve the classification performance. Thus, it is necessary to consider not only the performance of feature subsets in the objective space, but also their differences in the search space, to explore more promising feature combinations. To this end, this paper proposes a tri-objective method for bi-objective feature selection in classification, which solves a bi-objective feature selection problem as a tri-objective problem by considering the diversity (differences) between feature subsets in the search space as the third objective. The selection based on the converted tri-objective method can maintain a balance between minimizing the number of selected features, maximizing the classification performance, and exploring more promising feature subsets. Furthermore, a novel initialization strategy and an offspring reproduction operator are proposed to promote the diversity of feature subsets in the objective space and improve the search ability, respectively. The proposed algorithm is compared with five multiobjective-based feature selection methods, six typical feature selection methods, and two peer methods with diversity as a helper objective. Experimental results on 20 real-world classification datasets suggest that the proposed method outperforms the compared methods in most scenarios.

Citations: 0
IOHexperimenter: Benchmarking Platform for Iterative Optimization Heuristics.
IF 4.6 | CAS Tier 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-03 | DOI: 10.1162/evco_a_00342
Jacob de Nobel, Furong Ye, Diederick Vermetten, Hao Wang, Carola Doerr, Thomas Bäck

We present IOHexperimenter, the experimentation module of the IOHprofiler project. IOHexperimenter aims at providing an easy-to-use and customizable toolbox for benchmarking iterative optimization heuristics such as local search, evolutionary and genetic algorithms, and Bayesian optimization techniques. IOHexperimenter can be used as a stand-alone tool or as part of a benchmarking pipeline that uses other modules of the IOHprofiler environment. IOHexperimenter provides an efficient interface between optimization problems and their solvers while allowing for granular logging of the optimization process. Its logs are fully compatible with existing tools for interactive data analysis, which significantly speeds up the deployment of a benchmarking pipeline. The main components of IOHexperimenter are the environment to build customized problem suites and the various logging options that allow users to steer the granularity of the data records.
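The core interface idea, a problem wrapped by a logger that records the optimization trajectory at a chosen granularity, can be mimicked in plain Python (a stdlib-only sketch of the concept, not IOHexperimenter's actual API):

```python
# Sketch: wrap any objective so every evaluation is counted and best-so-far
# improvements are logged, decoupling the solver from the bookkeeping --
# the separation of concerns IOHexperimenter provides for real benchmarks.

class LoggedProblem:
    def __init__(self, objective):
        self.objective = objective
        self.evaluations = 0
        self.trajectory = []           # (evaluation count, best-so-far) pairs
        self.best = float("inf")

    def __call__(self, x):
        self.evaluations += 1
        y = self.objective(x)
        if y < self.best:              # log improvements only: one granularity choice
            self.best = y
            self.trajectory.append((self.evaluations, y))
        return y

sphere = LoggedProblem(lambda x: sum(v * v for v in x))
for point in [[2, 2], [1, 3], [1, 1], [0.5, 0], [1, 0]]:
    sphere(point)
print(sphere.trajectory)
```

Any solver that only calls `sphere(x)` produces this log as a side effect, which is what makes logs comparable across algorithms and directly consumable by analysis tools.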

Citations: 0
Pflacco: Feature-Based Landscape Analysis of Continuous and Constrained Optimization Problems in Python.
IF 4.6 | CAS Tier 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-03 | DOI: 10.1162/evco_a_00341
Raphael Patrick Prager, Heike Trautmann

The proposed Python package pflacco provides a set of numerical features to characterize single-objective continuous and constrained optimization problems. In this way, pflacco addresses two major challenges in the area of optimization. Firstly, it provides the means to develop an understanding of a given problem instance, which is crucial for designing, selecting, or configuring optimization algorithms in general. Secondly, these numerical features can be utilized in the research streams of automated algorithm selection and configuration. While the majority of these landscape features are already available in the R package flacco, our Python implementation offers these tools to an even wider audience and thereby promotes research interests and novel avenues in the area of optimization.
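To make "numerical landscape feature" concrete, here is one classic example computed stdlib-only: fitness-distance correlation, how strongly sample fitness correlates with distance to the best sampled point (an illustration of the feature family, not pflacco's API):

```python
import math, random

def fitness_distance_correlation(X, y):
    """Pearson correlation between fitness values and distances to the best sample."""
    best = X[y.index(min(y))]
    d = [math.dist(x, best) for x in X]
    my, md = sum(y) / len(y), sum(d) / len(d)
    cov = sum((yi - my) * (di - md) for yi, di in zip(y, d))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    sd = math.sqrt(sum((di - md) ** 2 for di in d))
    return cov / (sy * sd)

random.seed(1)
X = [[random.uniform(-5, 5) for _ in range(2)] for _ in range(200)]
y = [x[0] ** 2 + x[1] ** 2 for x in X]          # sphere: a unimodal bowl
fdc = fitness_distance_correlation(X, y)
print(round(fdc, 3))                            # near 1 for unimodal landscapes
```

Values near 1 indicate a globally convergent structure, while rugged or deceptive landscapes push the feature toward 0 or below; pflacco bundles dozens of such features (dispersion, ELA meta-model, information content, and more) behind one interface.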

Citations: 0
Evolutionary Sparsity Regularisation-based Feature Selection for Binary Classification.
IF 4.6 | CAS Tier 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-22 | DOI: 10.1162/evco_a_00358
Bach Hoai Nguyen, Bing Xue, Mengjie Zhang

In classification, feature selection is an essential pre-processing step that selects a small subset of features to improve classification performance. Existing feature selection approaches can be divided into three main categories: wrapper approaches, filter approaches, and embedded approaches. In comparison with the two other approaches, embedded approaches usually offer a better trade-off between classification performance and computation time. One of the most well-known embedded approaches is sparsity regularisation-based feature selection, which generates sparse solutions for feature selection. Despite its good performance, sparsity regularisation-based feature selection outputs only a feature ranking, which requires the number of selected features to be predefined. More importantly, the ranking mechanism introduces a risk of ignoring feature interactions, so many top-ranked but redundant features may be selected. This work addresses the above problems by proposing a new representation that considers the interactions between features and can automatically determine an appropriate number of selected features. The proposed representation is used in a differential evolution (DE) algorithm to optimise the feature subset. In addition, a novel initialisation mechanism is proposed to let DE consider various numbers of selected features at the beginning. The proposed algorithm is examined on both synthetic and real-world datasets. The results on the synthetic dataset show that the proposed algorithm can select complementary features while existing sparsity regularisation-based feature selection algorithms are at risk of selecting redundant features. The results on real-world datasets show that the proposed algorithm achieves better classification performance than well-known wrapper, filter, and embedded approaches. The algorithm is also as efficient as filter feature selection approaches.
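One way such a representation can determine the subset size automatically (our illustration of the idea, not the paper's exact encoding): each feature gets a real-valued weight evolved by DE, and decoding keeps the features whose weight exceeds a threshold that is itself part of the candidate, so the number of selected features emerges from the search instead of being fixed in advance.

```python
# candidate = [threshold, w_1, ..., w_d]; decoding yields a variable-size subset.

def decode(candidate):
    threshold, weights = candidate[0], candidate[1:]
    return [i for i, w in enumerate(weights) if w > threshold]

def de_mutation(a, b, c, f=0.5):
    """Classic DE/rand/1 mutant vector: a + F * (b - c)."""
    return [ai + f * (bi - ci) for ai, bi, ci in zip(a, b, c)]

cand = [0.5, 0.9, 0.1, 0.7, 0.4]
print(decode(cand))                      # features whose weight exceeds 0.5

mutant = de_mutation([0.5, 0.9, 0.1, 0.7, 0.4],
                     [0.4, 0.8, 0.6, 0.1, 0.2],
                     [0.6, 0.2, 0.4, 0.3, 0.6])
print([round(v, 2) for v in mutant])
```

Because mutation perturbs the threshold along with the weights, feature interactions are evaluated jointly through the decoded subset rather than feature-by-feature as in a ranking.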

Citations: 0
Landscape Analysis for Surrogate Models in the Evolutionary Black-Box Context.
IF 4.6 | CAS Tier 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-08-14 | DOI: 10.1162/evco_a_00357
Zbyněk Pitra, Jan Koza, Jiří Tumpach, Martin Holeňa

Surrogate modeling has become a valuable technique for black-box optimization tasks with expensive evaluation of the objective function. In this paper, we investigate the relationships between the predictive accuracy of surrogate models, their settings, and features of the black-box function landscape during evolutionary optimization by the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), a state-of-the-art optimizer for expensive continuous black-box tasks. This study aims to establish the foundation for specific rules and automated methods for selecting and tuning surrogate models by exploring relationships between landscape features and model errors, focusing on the behavior of a specific model within each generation in contrast to selecting a specific algorithm at the outset. We perform a feature analysis process, identifying a significant number of non-robust features and clustering similar landscape features, resulting in the selection of 14 features out of 384, varying with input data selection methods. Our analysis explores the error dependencies of four models across 39 settings, utilizing three methods for input data selection, drawn from surrogate-assisted CMA-ES runs on noiseless benchmarks within the Comparing Continuous Optimizers framework.
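The "model error" being analyzed can be illustrated with a deliberately simple stdlib-only surrogate (a 1-nearest-neighbor regressor standing in for the Gaussian-process-style models used with CMA-ES; our toy, not the paper's setup): train on sampled evaluations of a black-box function and measure predictive error on held-out points.

```python
import math, random

def nn_surrogate(train_X, train_y):
    """Predict the objective value of the nearest training point."""
    def predict(x):
        i = min(range(len(train_X)), key=lambda j: math.dist(x, train_X[j]))
        return train_y[i]
    return predict

def rmse(pred, X, y):
    return math.sqrt(sum((pred(x) - yi) ** 2 for x, yi in zip(X, y)) / len(y))

random.seed(2)
f = lambda x: (x[0] - 1) ** 2 + 2 * (x[1] + 0.5) ** 2     # the "black box"
X = [[random.uniform(-2, 2), random.uniform(-2, 2)] for _ in range(120)]
y = [f(x) for x in X]
model = nn_surrogate(X[:100], y[:100])
print(round(rmse(model, X[100:], y[100:]), 3))            # held-out error
```

Relating errors like this one to landscape features of `f` across many functions and model settings is, in miniature, the kind of dependency the study quantifies.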

Citations: 0
Runtime Analysis of Single- and Multi-Objective Evolutionary Algorithms for Chance Constrained Optimization Problems with Normally Distributed Random Variables.
IF 4.6, CAS Tier 2 (Computer Science), Q2 (Computer Science, Artificial Intelligence). Pub Date: 2024-08-02. DOI: 10.1162/evco_a_00355
Frank Neumann, Carsten Witt

Chance constrained optimization problems make it possible to model settings in which constraints involving stochastic components may be violated only with a small probability. Evolutionary algorithms have been applied to this scenario and shown to achieve high-quality results. With this paper, we contribute to the theoretical understanding of evolutionary algorithms for chance constrained optimization. We study the scenario of stochastic components that are independent and normally distributed. Considering the simple single-objective (1+1) EA, we show that imposing an additional uniform constraint already leads to local optima for very restricted scenarios and an exponential optimization time. We therefore introduce a multi-objective formulation of the problem which trades off the expected cost and its variance. We show that multi-objective evolutionary algorithms are highly effective when using this formulation and obtain a set of solutions that contains an optimal solution for any possible confidence level imposed on the constraint. Furthermore, we prove that this approach can also be used to compute a set of optimal solutions for the chance constrained minimum spanning tree problem. In order to deal with the potentially exponentially many trade-offs in the multi-objective formulation, we propose and analyze improved convex multi-objective approaches. Experimental investigations on instances of the NP-hard stochastic minimum weight dominating set problem confirm the benefit of the multi-objective and the improved convex multi-objective approaches in practice.
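When a solution's cost is a sum of independent normally distributed components, it is distributed N(μ(x), v(x)), and for any confidence level α ≥ 1/2 its α-quantile equals μ(x) + K·√v(x) with K = Φ⁻¹(α) ≥ 0. Since this expression is monotone in both μ and v, the quantile-optimal solution for every α lies on the Pareto front of (expected cost, variance) — the property the multi-objective formulation exploits. A minimal sketch on hypothetical items (the numbers are invented for illustration, not taken from the paper):

```python
import math
from itertools import combinations

# Hypothetical stochastic items: (expected cost, variance). Costs are
# independent and normally distributed, so a subset's total cost is
# N(sum of means, sum of variances).
items = [(3.0, 4.0), (4.0, 1.0), (2.0, 9.0), (5.0, 0.25)]

def profile(subset):
    mu = sum(items[i][0] for i in subset)
    var = sum(items[i][1] for i in subset)
    return mu, var

# All subsets of size 2 stand in for the feasible solutions of some problem.
sols = [frozenset(c) for c in combinations(range(len(items)), 2)]

def dominates(a, b):
    (m1, v1), (m2, v2) = profile(a), profile(b)
    return m1 <= m2 and v1 <= v2 and (m1, v1) != (m2, v2)

pareto = [s for s in sols if not any(dominates(t, s) for t in sols)]

# The alpha-quantile of a solution's cost is mu + K * sigma with
# K = Phi^{-1}(alpha) >= 0 for alpha >= 1/2; its minimizer over all
# solutions always lies on the (mean, variance) Pareto front.
for K in (0.0, 0.5, 1.0, 2.0, 3.0):
    best = min(sols, key=lambda s: profile(s)[0] + K * math.sqrt(profile(s)[1]))
    assert best in pareto
print(len(pareto), "Pareto-optimal trade-offs cover every confidence level")
```

Every K ≥ 0 selects its optimum from the same Pareto set, which is why a multi-objective algorithm that computes this front answers all confidence levels at once.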

Citations: 0
Using Machine Learning Methods to Assess Module Performance Contribution in Modular Optimization Frameworks.
IF 4.6, CAS Tier 2 (Computer Science), Q2 (Computer Science, Artificial Intelligence). Pub Date: 2024-08-02. DOI: 10.1162/evco_a_00356
Ana Kostovska, Diederick Vermetten, Peter Korošec, Sašo Džeroski, Carola Doerr, Tome Eftimov

Modular algorithm frameworks not only allow for combinations never tested in manually selected algorithm portfolios, but they also provide a structured approach to assess which algorithmic ideas are crucial for the observed performance of algorithms. In this study, we propose a methodology for analyzing the impact of the different modules on the overall performance. We consider modular frameworks for two widely used families of derivative-free black-box optimization algorithms, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) and differential evolution (DE). More specifically, we use performance data of 324 modCMA-ES and 576 modDE algorithm variants (with each variant corresponding to a specific configuration of modules) obtained on the 24 BBOB problems for 6 different runtime budgets in 2 dimensions. Our analysis of these data reveals that the impact of individual modules on overall algorithm performance varies significantly. Notably, among the examined modules, the elitism module in CMA-ES and the linear population size reduction module in DE exhibit the most significant impact on performance. Furthermore, our exploratory data analysis of problem landscape data suggests that the most relevant landscape features remain consistent regardless of the configuration of individual modules, but the influence that these features have on regression accuracy varies. In addition, we apply classifiers that exploit the feature importances of the models trained for performance prediction, together with the performance data, to predict the modular configurations of CMA-ES and DE algorithm variants. The results show that the predicted configurations do not exhibit a statistically significant difference in performance compared to the true configurations, with the percentage varying depending on the setup (from 49.1% to 95.5% for mod-CMA and 21.7% to 77.1% for DE).
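One simple way to read "impact of a module" off configuration-performance data is a main-effect analysis: average the performance change when a single module is toggled while all other modules are held fixed. A toy sketch on synthetic data (the module names and effect sizes below are invented for illustration, not the paper's modCMA-ES/modDE measurements):

```python
from itertools import product
from statistics import mean

# Hypothetical performance model for a tiny modular framework with three
# binary modules; lower values are better. Purely illustrative numbers.
def performance(elitism, restart, mirrored):
    base = 10.0
    base -= 3.0 * elitism              # elitism helps a lot in this toy model
    base -= 0.5 * restart
    base += 0.2 * elitism * mirrored   # a small interaction effect
    return base

configs = list(product((0, 1), repeat=3))

def main_effect(module_index):
    """Average performance change when switching one module off -> on,
    holding the other modules fixed (negative = the module helps)."""
    deltas = []
    for cfg in configs:
        if cfg[module_index] == 0:
            on = list(cfg)
            on[module_index] = 1
            deltas.append(performance(*on) - performance(*cfg))
    return mean(deltas)

effects = {name: main_effect(i)
           for i, name in enumerate(("elitism", "restart", "mirrored"))}
print(effects)  # elitism shows the largest (most negative) main effect
```

On real data the same idea scales up to, e.g., functional ANOVA or the feature importances of a trained performance-prediction model, which is closer in spirit to what the study does.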

Citations: 0