
Evolutionary Computation: Latest Articles

Drift Analysis with Fitness Levels for Elitist Evolutionary Algorithms.
IF 6.8 Zone 2 Computer Science Q1 Mathematics Pub Date: 2024-03-22 DOI: 10.1162/evco_a_00349
Jun He, Yuren Zhou

The fitness level method is a popular tool for analyzing the hitting time of elitist evolutionary algorithms. Its idea is to divide the search space into multiple fitness levels and to estimate lower and upper bounds on the hitting time using transition probabilities between fitness levels. However, the lower bound generated by this method is often loose. An open question regarding the fitness level method is what the tightest lower and upper time bounds are that can be constructed from transition probabilities between fitness levels. To answer this question, we combine drift analysis with fitness levels and define the tightest bound problem as a constrained multi-objective optimization problem subject to fitness levels. The tightest metric bounds by fitness levels are constructed and proven for the first time. Linear bounds are then derived from the metric bounds, and a framework is established that can be used to develop different fitness level methods for different types of linear bounds. The framework is generic and promising, as it can be used to derive tight time bounds on fitness landscapes both with and without shortcuts. This is demonstrated on the example of the (1+1) EA maximizing the TwoMax1 function.
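
As a rough illustration of the classical fitness level upper bound that the abstract takes as its starting point, the sketch below sums 1/s_i over the non-optimal levels, where s_i is a lower bound on the probability of leaving level i for a higher one. The choice of OneMax (rather than TwoMax1, whose definition is not given here), the mutation rate 1/n, and all function names are assumptions made for this example; the paper's tighter drift-based bounds are not reproduced.

```python
import math


def fitness_level_upper_bound(leave_probs):
    """Classical fitness-level upper bound on the expected hitting time:
    the sum of 1/s_i, where s_i lower-bounds the probability of leaving
    non-optimal level i for any higher level."""
    return sum(1.0 / s for s in leave_probs)


def onemax_leave_probs(n):
    """Assumed example: (1+1) EA with bitwise mutation rate 1/n on OneMax.
    Level i holds the strings with i one-bits. Flipping exactly one of the
    n - i zero-bits and no other bit is sufficient to reach a higher level."""
    keep_rest = (1.0 - 1.0 / n) ** (n - 1)
    return [(n - i) / n * keep_rest for i in range(n)]


def harmonic(n):
    """n-th harmonic number H_n."""
    return sum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    n = 50
    bound = fitness_level_upper_bound(onemax_leave_probs(n))
    print(f"fitness-level upper bound: {bound:.1f}")
    print(f"e * n * H_n:               {math.e * n * harmonic(n):.1f}")
```

Because (1 - 1/n)^(n-1) is at least 1/e, the computed bound never exceeds e·n·H_n, which is the familiar O(n log n) upper bound for this setting.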

Citations: 0
Editorial for the Special Issue on Reproducibility.
IF 6.8 Zone 2 Computer Science Q1 Mathematics Pub Date: 2024-03-01 DOI: 10.1162/evco_e_00344
Manuel López-Ibáñez, Luís Paquete, Mike Preuss
Citations: 0
A Practical Methodology for Reproducible Experimentation: An Application to the Double-Row Facility Layout Problem.
IF 6.8 Zone 2 Computer Science Q1 Mathematics Pub Date: 2024-03-01 DOI: 10.1162/evco_a_00317
Raúl Martín-Santamaría, Sergio Cavero, Alberto Herrán, Abraham Duarte, J Manuel Colmenar

Reproducibility of experiments is a complex task in stochastic methods such as evolutionary algorithms or metaheuristics in general. Many works from the literature give general guidelines to favor reproducibility. However, none of them provides both a practical set of steps and software tools to help in this process. In this article, we propose a practical methodology to favor reproducibility in optimization problems tackled with stochastic methods. This methodology is divided into three main steps, where the researcher is assisted by software tools which implement state-of-the-art techniques related to this process. The methodology has been applied to study the double-row facility layout problem (DRFLP), for which we propose a new algorithm able to obtain better results than the state-of-the-art methods. To this end, we have also replicated the previous methods in order to complete the study with a new set of larger instances. All the produced artifacts related to the methodology and the study of the target problem are available in Zenodo.

Citations: 0
The Importance of Being Constrained: Dealing with Infeasible Solutions in Differential Evolution and Beyond.
IF 6.8 Zone 2 Computer Science Q1 Mathematics Pub Date: 2024-03-01 DOI: 10.1162/evco_a_00333
Anna V Kononova, Diederick Vermetten, Fabio Caraffini, Madalina-A Mitran, Daniela Zaharie

We argue that results produced by a heuristic optimisation algorithm cannot be considered reproducible unless the algorithm fully specifies what should be done with solutions generated outside the domain, even in the case of simple bound constraints. Currently, in the field of heuristic optimisation, such specification is rarely mentioned or investigated due to the assumed triviality or insignificance of this question. Here, we demonstrate that, at least in algorithms based on Differential Evolution, this choice induces notably different behaviours in terms of performance, disruptiveness, and population diversity. This is shown theoretically (where possible) for standard Differential Evolution in the absence of selection pressure and experimentally for the standard and state-of-the-art Differential Evolution variants, on a special test function and the BBOB benchmarking suite, respectively. Moreover, we demonstrate that the importance of this choice quickly grows with problem dimensionality. Differential Evolution is not at all special in this regard: there is no reason to presume that other heuristic optimisers are not equally affected by the aforementioned algorithmic choice. Thus, we urge the heuristic optimisation community to formalise and adopt the idea of a new algorithmic component in heuristic optimisers, which we refer to as the strategy of dealing with infeasible solutions. This component needs to be consistently: (a) specified in algorithmic descriptions to guarantee reproducibility of results, (b) studied to better understand its impact on an algorithm's performance in a wider sense (e.g., convergence time and robustness), and (c) included in the (automatic) design of algorithms. All of these should be done even for problems with bound constraints.
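
To make the point about under-specified bound handling concrete, the sketch below applies four commonly used repair strategies to the same out-of-domain vector. The strategy names, the function signatures, and the choice of these four are assumptions made for illustration; the abstract does not list the strategies the authors studied.

```python
import random


def saturate(x, lo, hi):
    """Clip the coordinate onto the nearest bound."""
    return min(max(x, lo), hi)


def mirror(x, lo, hi):
    """Reflect the coordinate back into [lo, hi], repeating if necessary."""
    width = hi - lo
    y = (x - lo) % (2 * width)
    return lo + (y if y <= width else 2 * width - y)


def toroidal(x, lo, hi):
    """Wrap the coordinate around as if the domain were a ring."""
    return lo + (x - lo) % (hi - lo)


def make_resampler(seed):
    """Return a strategy that redraws the coordinate uniformly in [lo, hi]."""
    rng = random.Random(seed)

    def resample(x, lo, hi):
        return rng.uniform(lo, hi)

    return resample


def repair(vector, lo, hi, strategy):
    """Apply the strategy only to coordinates that violate the bounds."""
    return [v if lo <= v <= hi else strategy(v, lo, hi) for v in vector]


if __name__ == "__main__":
    trial = [1.3, 0.5, -0.2]  # hypothetical trial vector, domain [0, 1]^3
    for name, s in [("saturate", saturate), ("mirror", mirror),
                    ("toroidal", toroidal), ("resample", make_resampler(1))]:
        print(name, repair(trial, 0.0, 1.0, s))
```

The same infeasible trial vector ends up at visibly different points under each strategy, which is why two implementations that are otherwise identical can diverge when this choice is left unstated.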

Citations: 0
Using Decomposed Error for Reproducing Implicit Understanding of Algorithms.
IF 6.8 Zone 2 Computer Science Q1 Mathematics Pub Date: 2024-03-01 DOI: 10.1162/evco_a_00321
Caitlin A Owen, Grant Dick, Peter A Whigham

Reproducibility is important for having confidence in evolutionary machine learning algorithms. Although the focus of reproducibility is usually to recreate an aggregate prediction error score using fixed random seeds, this is not sufficient. Firstly, multiple runs of an algorithm, without a fixed random seed, should ideally return statistically equivalent results. Secondly, it should be confirmed whether the expected behaviour of an algorithm matches its actual behaviour, in terms of how an algorithm targets a reduction in prediction error. Confirming the behaviour of an algorithm is not possible when using a total error aggregate score. Using an error decomposition framework as a methodology for improving the reproducibility of results in evolutionary computation addresses both of these factors. By estimating decomposed error using multiple runs of an algorithm and multiple training sets, the framework provides a greater degree of certainty about the prediction error. Also, decomposing error into bias, variance due to the algorithm (internal variance), and variance due to the training data (external variance) more fully characterises evolutionary algorithms. This allows the behaviour of an algorithm to be confirmed. Applying the framework to a number of evolutionary algorithms shows that their expected behaviour can be different to their actual behaviour. Identifying a behaviour mismatch is important in terms of understanding how to further refine an algorithm as well as how to effectively apply an algorithm to a problem.
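
The decomposition into bias, internal variance, and external variance can be sketched for a single test point as follows. This is a minimal illustration under stated assumptions: a noise-free target, the same number of runs for every training set, and a hypothetical `preds[d][r]` layout holding the prediction from run `r` on training set `d`. It is not the authors' estimation procedure.

```python
from statistics import fmean, pvariance


def decompose_error(preds, y_true):
    """Split the mean squared error at one test point into
    (squared bias, internal variance, external variance).

    preds[d][r] is the prediction of run r trained on training set d.
    Assumes a noise-free target and equally many runs per training set."""
    per_set_means = [fmean(runs) for runs in preds]
    grand_mean = fmean(per_set_means)
    bias_sq = (grand_mean - y_true) ** 2
    # Spread between training sets, after averaging out run-to-run noise.
    external = pvariance(per_set_means)
    # Average spread between runs that share the same training set.
    internal = fmean([pvariance(runs) for runs in preds])
    return bias_sq, internal, external


if __name__ == "__main__":
    preds = [[1.0, 3.0], [5.0, 7.0]]  # 2 training sets x 2 runs (made-up numbers)
    y = 3.0
    bias_sq, internal, external = decompose_error(preds, y)
    mse = fmean([(p - y) ** 2 for runs in preds for p in runs])
    print(bias_sq, internal, external, mse)
```

With these made-up numbers the three terms sum to the plain mean squared error, yet they separate how much of it comes from the algorithm's randomness and how much from the choice of training data, which a single aggregate score hides.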

Citations: 0
BUSTLE: a Versatile Tool for the Evolutionary Learning of STL Specifications from Data.
IF 6.8 Zone 2 Computer Science Q1 Mathematics Pub Date: 2024-02-19 DOI: 10.1162/evco_a_00347
Federico Pigozzi, Laura Nenzi, Eric Medvet

Describing the properties of complex systems that evolve over time is a crucial requirement for monitoring and understanding them. Signal Temporal Logic (STL) is a framework that has proved effective for this aim because it is expressive and allows properties to be stated as human-readable formulae. Crafting STL formulae that fit a particular system is, however, a difficult task. For this reason, a few approaches have been proposed recently for the automatic learning of STL formulae starting from observations of the system. In this paper, we propose BUSTLE (Bi-level Universal STL Evolver), an approach based on evolutionary computation for learning STL formulae from data. BUSTLE advances the state-of-the-art because it (i) applies to a broader class of problems, in terms of what is known about the state of the system during its observation, and (ii) generates both the structure and the values of the parameters of the formulae employing a bi-level search mechanism (global for the structure, local for the parameters). We consider two cases where (a) observations of the system in both anomalous and regular state are available, or (b) only observations of regular state are available. We experimentally evaluate BUSTLE on problem instances corresponding to the two cases and compare it against previous approaches. We show that the evolved STL formulae are effective and human-readable: the versatility of BUSTLE does not come at the cost of lower effectiveness.
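
For readers unfamiliar with STL, the sketch below evaluates two very simple formulae, "always x > c" and "eventually x > c", on a finite sampled signal using the standard quantitative (robustness) semantics. The function names and the restriction to a single scalar signal with an unbounded time window are assumptions made for this example; it does not reflect how BUSTLE represents or evolves formulae.

```python
def rob_always_gt(signal, threshold):
    """Robustness of 'always (x > threshold)' on a finite signal:
    the worst-case margin over all samples."""
    return min(x - threshold for x in signal)


def rob_eventually_gt(signal, threshold):
    """Robustness of 'eventually (x > threshold)' on a finite signal:
    the best-case margin over all samples."""
    return max(x - threshold for x in signal)


def satisfied(robustness):
    """A strictly positive robustness means the formula holds."""
    return robustness > 0


if __name__ == "__main__":
    trace = [1.0, 2.0, 3.0]  # hypothetical observations of one variable
    print(rob_always_gt(trace, 0.5), satisfied(rob_always_gt(trace, 0.5)))
    print(rob_always_gt(trace, 2.5), satisfied(rob_always_gt(trace, 2.5)))
    print(rob_eventually_gt(trace, 2.5), satisfied(rob_eventually_gt(trace, 2.5)))
```

In this toy setting the operator (always or eventually) plays the role of the formula's structure and `threshold` plays the role of a parameter, which loosely mirrors the structure/parameter split that a bi-level search would exploit.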

Citations: 0
Informed Down-Sampled Lexicase Selection: Identifying productive training cases for efficient problem solving.
IF 6.8 Zone 2 Computer Science Q1 Mathematics Pub Date: 2024-01-26 DOI: 10.1162/evco_a_00346
Ryan Boldi, Martin Briesch, Dominik Sobania, Alexander Lalejini, Thomas Helmuth, Franz Rothlauf, Charles Ofria, Lee Spector

Genetic Programming (GP) often uses large training sets and requires all individuals to be evaluated on all training cases during selection. Random down-sampled lexicase selection evaluates individuals on only a random subset of the training cases allowing for more individuals to be explored with the same amount of program executions. However, sampling randomly can exclude important cases from the down-sample for a number of generations, while cases that measure the same behavior (synonymous cases) may be overused. In this work, we introduce Informed Down-Sampled Lexicase Selection. This method leverages population statistics to build down-samples that contain more distinct and therefore informative training cases. Through an empirical investigation across two different GP systems (PushGP and Grammar-Guided GP), we find that informed down-sampling significantly outperforms random down-sampling on a set of contemporary program synthesis benchmark problems. Through an analysis of the created down-samples, we find that important training cases are included in the down-sample consistently across independent evolutionary runs and systems. We hypothesize that this improvement can be attributed to the ability of Informed Down-Sampled Lexicase Selection to maintain more specialist individuals over the course of evolution, while still benefiting from reduced per-evaluation costs.
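
The idea of preferring distinct training cases over synonymous ones can be sketched as follows. The sketch assumes binary pass/fail outcomes (an error of 0 means solved), Hamming distance between the cases' solve patterns across the population, and a farthest-first traversal to build the down-sample. These details and all names are assumptions for illustration and may differ from the published method.

```python
import random


def lexicase_select(errors, cases, rng):
    """Pick one parent: filter candidates case by case, in random order,
    keeping only those with the best error on the current case.
    errors[i][c] is the error of individual i on training case c."""
    candidates = list(range(len(errors)))
    order = list(cases)
    rng.shuffle(order)
    for c in order:
        best = min(errors[i][c] for i in candidates)
        candidates = [i for i in candidates if errors[i][c] == best]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)


def informed_down_sample(errors, size, rng):
    """Build a down-sample of mutually distant cases (farthest-first)."""
    n_cases = len(errors[0])
    # Solve pattern of each case: which individuals solve it.
    solved = [[ind[c] == 0 for ind in errors] for c in range(n_cases)]

    def dist(a, b):
        return sum(x != y for x, y in zip(solved[a], solved[b]))

    sample = [rng.randrange(n_cases)]
    while len(sample) < size:
        remaining = [c for c in range(n_cases) if c not in sample]
        # Add the case whose nearest already-chosen case is farthest away.
        sample.append(max(remaining, key=lambda c: min(dist(c, s) for s in sample)))
    return sample


if __name__ == "__main__":
    rng = random.Random(0)
    # Cases 0 and 1 are synonymous: exactly the same individuals solve them.
    errors = [[0, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
    cases = informed_down_sample(errors, 3, rng)
    print(cases, lexicase_select(errors, cases, rng))
```

In this toy population a down-sample of size three never contains both synonymous cases, so every specialist keeps at least one case on which it can win selection.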

Citations: 0
Estimation of Distribution Algorithm for Grammar-Guided Genetic Programming.
IF 6.8 Zone 2 Computer Science Q1 Mathematics Pub Date: 2024-01-26 DOI: 10.1162/evco_a_00345
Pablo Ramos Criado, D Barrios Rolanía, David de la Hoz, Daniel Manrique

Genetic variation operators in grammar-guided genetic programming are fundamental to guide the evolutionary process in search and optimization problems. However, they show some limitations, mainly derived from an unbalanced exploration and local-search trade-off. This article presents an estimation of distribution algorithm for grammar-guided genetic programming to overcome this difficulty and thus increase the performance of the evolutionary algorithm. Our proposal employs an extended dynamic stochastic context-free grammar to encode and calculate the estimation of the distribution of the search space from some promising individuals in the population. Unlike traditional estimation of distribution algorithms, the proposed approach improves exploratory behavior by smoothing the estimated distribution model. Therefore, this algorithm is referred to as SEDA, smoothed estimation of distribution algorithm. Experiments have been conducted to compare overall performance using a typical genetic programming crossover operator, an incremental estimation of distribution algorithm, and the proposed approach after tuning their hyperparameters. These experiments involve challenging problems to test the local search and exploration features of the three evolutionary systems. The results show that grammar-guided genetic programming with SEDA achieves the most accurate solutions with an intermediate convergence speed.
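
One simple way to picture "smoothing the estimated distribution model" is to estimate production-rule probabilities from the selected individuals and then mix each distribution with a uniform one, so that rules absent from the current elite keep a non-zero chance of being sampled. The mixing formula, the `alpha` parameter, and the data layout below are assumptions chosen for illustration; the abstract does not specify the smoothing operator SEDA actually uses.

```python
def estimate_rule_probs(rule_counts):
    """Maximum-likelihood estimate of production probabilities.
    rule_counts maps each nonterminal to {production: usage count},
    counted over the promising individuals."""
    probs = {}
    for nonterminal, counts in rule_counts.items():
        total = sum(counts.values())
        k = len(counts)
        probs[nonterminal] = {
            prod: (c / total if total else 1.0 / k) for prod, c in counts.items()
        }
    return probs


def smooth(probs, alpha):
    """Blend each distribution with the uniform one (0 <= alpha <= 1).
    alpha = 0 keeps the estimate unchanged; alpha = 1 is pure exploration."""
    smoothed = {}
    for nonterminal, dist in probs.items():
        k = len(dist)
        smoothed[nonterminal] = {
            prod: (1.0 - alpha) * p + alpha / k for prod, p in dist.items()
        }
    return smoothed


if __name__ == "__main__":
    # Hypothetical usage counts for one nonterminal of a toy expression grammar.
    counts = {"E": {"E + E": 6, "x": 0, "1": 2}}
    estimated = estimate_rule_probs(counts)
    print(estimated["E"])
    print(smooth(estimated, 0.3)["E"])
```

Without smoothing, the production `x` would have probability zero and could never be sampled again; after smoothing it remains reachable, which is the exploratory effect the abstract attributes to the approach.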

Citations: 0
A Data Stream Ensemble Assisted Multifactorial Evolutionary Algorithm for Offline Data-Driven Dynamic Optimization.
IF 6.8 Zone 2 Computer Science Q1 Mathematics Pub Date: 2023-12-01 DOI: 10.1162/evco_a_00332
Cuie Yang, Jinliang Ding, Yaochu Jin, Tianyou Chai

Existing work on offline data-driven optimization mainly focuses on problems in static environments, and little attention has been paid to problems in dynamic environments. Offline data-driven optimization in dynamic environments is challenging because the distribution of the collected data varies over time, so both the surrogate models and the optimal solutions must be tracked over time. This paper proposes a knowledge-transfer-based data-driven optimization algorithm to address these issues. First, an ensemble learning method is adopted to train surrogate models that leverage the knowledge of data from historical environments while adapting to new environments. Specifically, given data in a new environment, a model is constructed with the new data, and the preserved models of historical environments are further trained with the new data. These models are then treated as base learners and combined into an ensemble surrogate model. After that, all base learners and the ensemble surrogate model are simultaneously optimized in a multitask environment to find optimal solutions for the real fitness functions. In this way, the optimization tasks from previous environments can be used to accelerate the tracking of the optimum in the current environment. Since the ensemble model is the most accurate surrogate, more individuals are assigned to the ensemble surrogate than to its base learners. Empirical results on six dynamic optimization benchmark problems demonstrate the effectiveness of the proposed algorithm compared with four state-of-the-art offline data-driven optimization algorithms. Code is available at https://github.com/Peacefulyang/DSE_MFS.git.

Citations: 3
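
The surrogate construction described in the abstract above (build a fresh model on the new environment's data, further train the preserved historical models on the same data, then combine them) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available at the repository they cite. The class `LinearSurrogate`, the function `build_ensemble`, the 1-D linear model, and the inverse-error weighting are all assumptions chosen for brevity; the multitask optimization step is not shown.

```python
class LinearSurrogate:
    """Hypothetical 1-D surrogate y ~ w*x + b, fitted by gradient descent."""

    def __init__(self):
        self.w, self.b = 0.0, 0.0

    def predict(self, x):
        return self.w * x + self.b

    def train(self, data, epochs=500, lr=0.5):
        # Training continues from the current parameters, so calling
        # train() again with new data "further trains" an existing model.
        n = len(data)
        for _ in range(epochs):
            gw = sum((self.predict(x) - y) * x for x, y in data) / n
            gb = sum((self.predict(x) - y) for x, y in data) / n
            self.w -= lr * gw
            self.b -= lr * gb
        return self


def mse(model, data):
    """Mean squared error of a model on (x, y) pairs."""
    return sum((model.predict(x) - y) ** 2 for x, y in data) / len(data)


def build_ensemble(historical_models, new_data):
    # 1) A fresh model is constructed from the new environment's data.
    base = [LinearSurrogate().train(new_data)]
    # 2) Preserved historical models are further trained (in place).
    base += [m.train(new_data) for m in historical_models]
    # 3) Base learners are combined. Weighting by inverse error on the
    #    new data is an assumption of this sketch, not taken from the paper.
    inv = [1.0 / (mse(m, new_data) + 1e-12) for m in base]
    total = sum(inv)
    weights = [v / total for v in inv]

    def ensemble(x):
        return sum(wt * m.predict(x) for wt, m in zip(weights, base))

    return base, ensemble
```

In the approach the abstract describes, each base learner and the ensemble would then serve as one task in a multitask optimizer, with the ensemble receiving the largest share of individuals.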
Theoretical Analyses of Multiobjective Evolutionary Algorithms on Multimodal Objectives.
IF 6.8 Zone 2 Computer Science Q1 Mathematics Pub Date: 2023-12-01 DOI: 10.1162/evco_a_00328
Weijie Zheng, Benjamin Doerr

Multiobjective evolutionary algorithms are successfully applied in many real-world multiobjective optimization problems. As for many other AI methods, the theoretical understanding of these algorithms is lagging far behind their success in practice. In particular, previous theory work considers mostly easy problems that are composed of unimodal objectives. As a first step towards a deeper understanding of how evolutionary algorithms solve multimodal multiobjective problems, we propose the OneJumpZeroJump problem, a bi-objective problem composed of two objectives isomorphic to the classic jump function benchmark. We prove that the simple evolutionary multiobjective optimizer (SEMO) with probability one does not compute the full Pareto front, regardless of the runtime. In contrast, for all problem sizes $n$ and all jump sizes $k \in [4..\frac{n}{2}-1]$, the global SEMO (GSEMO) covers the Pareto front in an expected number of $\Theta((n-2k)n^{k})$ iterations. For $k=o(n)$, we also show the tighter bound $\frac{3}{2}en^{k+1} \pm o(n^{k+1})$, which might be the first runtime bound for an MOEA that is tight apart from lower-order terms. We also combine the GSEMO with two approaches that showed advantages in single-objective multimodal problems. When using the GSEMO with a heavy-tailed mutation operator, the expected runtime improves by a factor of at least $k^{\Omega(k)}$. When adapting the recent stagnation-detection strategy of Rajabi and Witt (2022) to the GSEMO, the expected runtime also improves by a factor of at least $k^{\Omega(k)}$ and surpasses the heavy-tailed GSEMO by a small polynomial factor in $k$. Via an experimental analysis, we show that these asymptotic differences are visible already for small problem sizes: A factor-5 speed-up from heavy-tailed mutation and a factor-10 speed-up from stagnation detection can be observed already for jump size 4 and problem sizes between 10 and 50. Overall, our results show that the ideas recently developed to aid single-objective evolutionary algorithms to cope with local optima can be effectively employed also in multiobjective optimization.

Citations: 36
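
To make the benchmark and algorithm named in the abstract above concrete, the following is a minimal sketch of a OneJumpZeroJump-style bi-objective function and a basic GSEMO loop with standard bit-wise mutation (rate 1/n). The abstract does not spell out either definition, so the function below follows the usual jump-function form applied once to the number of ones and once to the number of zeros. This is an assumption and should be checked against the paper. The heavy-tailed and stagnation-detection variants are not shown, and all function names are illustrative.

```python
import random


def jump(count, n, k):
    """Jump-style value as a function of the number of 'good' bits."""
    if count <= n - k or count == n:
        return k + count
    return n - count  # inside the gap of width k-1 just before the optimum


def one_jump_zero_jump(x, k):
    """Bi-objective value: jump on the ones count, jump on the zeros count."""
    n = len(x)
    ones = sum(x)
    return (jump(ones, n, k), jump(n - ones, n, k))


def dominates(u, v):
    """True if u weakly dominates v (both objectives maximized)."""
    return u[0] >= v[0] and u[1] >= v[1]


def gsemo(n, k, max_iters, rng):
    """Basic GSEMO: keeps a population of mutually non-dominated solutions."""
    x = tuple(rng.randint(0, 1) for _ in range(n))
    pop = {x: one_jump_zero_jump(x, k)}
    for _ in range(max_iters):
        parent = rng.choice(list(pop))
        # Standard bit-wise mutation: flip each bit independently w.p. 1/n.
        child = tuple(b ^ 1 if rng.random() < 1.0 / n else b for b in parent)
        fc = one_jump_zero_jump(child, k)
        # Discard the child if some member strictly dominates it.
        if any(dominates(fv, fc) and fv != fc for fv in pop.values()):
            continue
        # Otherwise drop every member the child weakly dominates, then add it.
        pop = {y: fy for y, fy in pop.items() if not dominates(fc, fy)}
        pop[child] = fc
    return pop
```

A call such as `gsemo(10, 4, 100000, random.Random(0))` returns a dictionary mapping bit tuples to objective pairs. Whether it covers the whole Pareto front within a given budget is random, which is what the runtime bounds quoted in the abstract quantify in expectation.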