Genetic Programming and Evolvable Machines: Latest Articles

Evolving code with a large language model

IF 2.6, Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines

Pub Date : 2024-09-12 DOI: 10.1007/s10710-024-09494-2

Erik Hemberg, Stephen Moskal, Una-May O’Reilly

Algorithms that use Large Language Models (LLMs) to evolve code arrived on the Genetic Programming (GP) scene very recently. We present LLM_GP, a general LLM-based evolutionary algorithm designed to evolve code. Like GP, it uses evolutionary operators, but its designs and implementations of those operators significantly differ from GP’s because they enlist an LLM, using prompting and the LLM’s pre-trained pattern matching and sequence completion capability. We also present a demonstration-level variant of LLM_GP and share its code. By presentations that range from formal to hands-on, we cover design and LLM-usage considerations as well as the scientific challenges that arise when using an LLM for genetic programming.
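
The idea of an evolutionary operator that "enlists an LLM" can be illustrated with a minimal sketch. This is not the LLM_GP code: `stub_llm` is a hypothetical stand-in for a real model call, and the prompt wording, the toy target `3*x + 2`, and all function names are assumptions made for illustration only.

```python
import random
import re

def stub_llm(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM: nudges one integer in the last prompt line."""
    expr = prompt.strip().splitlines()[-1]
    numbers = list(re.finditer(r"\d+", expr))
    if not numbers:
        return expr
    m = rng.choice(numbers)
    new_value = max(0, int(m.group()) + rng.choice([-1, 1]))
    return expr[:m.start()] + str(new_value) + expr[m.end():]

def error(expr: str) -> float:
    """Sum of squared errors against the toy target 3*x + 2 (lower is better)."""
    # eval is restricted to arithmetic on x; the expressions here are generated locally.
    return sum((eval(expr, {"__builtins__": {}}, {"x": x}) - (3 * x + 2)) ** 2
               for x in range(-5, 6))

def llm_mutate(expr: str, rng: random.Random) -> str:
    """Variation by prompting: the operator is a prompt template plus a model call."""
    prompt = "Make one small change to this Python expression in x:\n" + expr
    return stub_llm(prompt, rng)

def evolve(pop_size: int = 8, generations: int = 30, seed: int = 0) -> str:
    rng = random.Random(seed)
    population = ["1 * x + 1"] * pop_size
    for _ in range(generations):
        children = [llm_mutate(p, rng) for p in population]
        # Truncation selection over parents and children (elitist).
        population = sorted(population + children, key=error)[:pop_size]
    return population[0]

best = evolve()
```

Swapping `stub_llm` for a real model call would leave the loop unchanged, which is the sense in which the operators are defined by prompts rather than by tree manipulation.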

Citations: 0

HGA-LSTM: LSTM architecture and hyperparameter search by hybrid GA for air pollution prediction

IF 2.6, Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines

Pub Date : 2024-08-01 DOI: 10.1007/s10710-024-09493-3

Jiayu Liang, Yaxin Lu, Mingming Su

Air pollution prediction is the task of predicting the levels of air pollutants in a specific area over a given period. Because LSTM (Long Short-Term Memory) networks are particularly effective at capturing long-term dependencies and patterns in sequential data, they are widely used for air pollution prediction. However, designing appropriate LSTM architectures and hyperparameters for a given task can be challenging, and in existing LSTM-based methods these are normally chosen by the user. Genetic Algorithm (GA) is an effective optimization technique, and local search has been shown to augment the global search ability of GA, yet existing GA-optimized LSTM methods rarely take this into account. In this work, simultaneous LSTM architecture and hyperparameter search based on GA and local search techniques is investigated for air pollution prediction. Specifically, a new LSTM model search method, termed HGA-LSTM, is designed. HGA is a hybrid GA that integrates GA with local search adaptively. Based on HGA, HGA-LSTM is developed to search for LSTM models with simultaneous LSTM architecture and hyperparameter optimization. In HGA-LSTM, a new crossover is designed to suit the variable-length representation of LSTM models. The proposed HGA-LSTM is compared with widely used LSTM-based and non-LSTM-based prediction methods on UCI (University of California Irvine) datasets for air pollution prediction. Results show that HGA-LSTM is generally better than both types of reference methods, with its evolved LSTM models achieving lower mean square/absolute errors. Moreover, compared with a baseline method (a GA without local search), HGA-LSTM converges to lower error values, which indicates that HGA has better search ability than GA.
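
A hybrid of GA and local search over a variable-length representation can be sketched as follows. This is a simplified illustration, not the HGA-LSTM method: the genome encoding (hidden units per stacked LSTM layer), the `toy_error` placeholder that replaces actual LSTM training, and the always-on (non-adaptive) local search are all assumptions.

```python
import random

Genome = list[int]  # hypothetical encoding: hidden units per stacked LSTM layer

def toy_error(g: Genome) -> float:
    """Placeholder for validation error; a real run would train an LSTM here."""
    return abs(len(g) - 2) * 50 + sum(abs(u - 64) for u in g)

def crossover(a: Genome, b: Genome, rng: random.Random) -> Genome:
    """One-point crossover with a separate cut in each parent, so length can change."""
    cut_a = rng.randint(1, len(a))
    cut_b = rng.randint(0, len(b) - 1)
    return a[:cut_a] + b[cut_b:]

def local_search(g: Genome, rng: random.Random, step: int = 8) -> Genome:
    """Hill-climb one gene; keep a neighbour only if it lowers the error."""
    i = rng.randrange(len(g))
    best = g
    for delta in (-step, step):
        cand = g[:i] + [max(1, g[i] + delta)] + g[i + 1:]
        if toy_error(cand) < toy_error(best):
            best = cand
    return best

def hybrid_ga(generations: int = 40, pop_size: int = 10, seed: int = 1) -> Genome:
    rng = random.Random(seed)
    pop = [[rng.choice([16, 32, 128]) for _ in range(rng.randint(1, 4))]
           for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=toy_error)[:pop_size // 2]
        # Global step (crossover) followed by a local refinement of each child.
        children = [local_search(crossover(*rng.sample(parents, 2), rng), rng)
                    for _ in range(pop_size)]
        pop = sorted(parents + children, key=toy_error)[:pop_size]
    return pop[0]

best_genome = hybrid_ga()
```

Cutting each parent at its own position is what lets the offspring differ in depth from both parents, which a fixed-length crossover cannot do.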

Citations: 0

A survey on dynamic populations in bio-inspired algorithms

IF 2.6, Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines

Pub Date : 2024-07-24 DOI: 10.1007/s10710-024-09492-4

Davide Farinati, Leonardo Vanneschi

Population-Based Bio-Inspired Algorithms (PBBIAs) are computational methods that simulate natural biological processes, such as evolution or social behaviors, to solve optimization problems. Traditionally, PBBIAs use a population of static size, set beforehand through a specific parameter. Nevertheless, for several decades now, the idea of employing populations of dynamic size, capable of adjusting during the course of a single run, has gained ground. Various methods have been introduced, ranging from simpler ones that use a predefined function to determine the population size variation, to more sophisticated methods where the population size in different phases of the evolutionary process depends on the dynamics of the evolution itself and on events occurring within the population during the run. The underlying idea in many of these approaches is similar: to save a significant amount of computational effort in phases where the evolution is functioning well and a large population is therefore not needed. The saved effort can then be spent when optimization becomes more challenging and a greater computational effort is required. Numerous past contributions have demonstrated a notable advantage of using dynamically sized populations, often yielding results comparable to those obtained by standard PBBIAs but with a significant saving of computational effort. However, despite the numerous successes that have been presented, there is to date no comprehensive collection of past contributions on the use of dynamic populations that allows for their categorization and critical analysis. This article aims to bridge this gap by presenting a systematic literature review of the use of dynamic populations in PBBIAs and by identifying gaps in the research that can point the way to future work.
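
The two families of methods mentioned above (a predefined size function versus a rule driven by the run itself) can be contrasted in a short sketch. Both rules, their names, and every constant below are illustrative assumptions and are not taken from any surveyed method.

```python
def linear_schedule(gen: int, max_gen: int, n_start: int = 200, n_end: int = 20) -> int:
    """Predefined rule: interpolate the population size linearly over the run."""
    frac = gen / max_gen
    return round(n_start + frac * (n_end - n_start))

def feedback_rule(size: int, improved: bool, n_min: int = 20, n_max: int = 400) -> int:
    """Run-dependent rule: shrink while fitness improves, grow when it stagnates."""
    new_size = size * 0.9 if improved else size * 1.2
    return max(n_min, min(n_max, round(new_size)))

# Sizes at the start, middle, and end of a 100-generation run.
sizes = [linear_schedule(g, 100) for g in (0, 50, 100)]
```

The feedback rule encodes the shared idea described in the abstract: effort is saved while the search is going well and spent again once it stalls.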

Citations: 0

GSGP-hardware: instantaneous symbolic regression with an FPGA implementation of geometric semantic genetic programming

IF 2.6, Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines

Pub Date : 2024-06-25 DOI: 10.1007/s10710-024-09491-5

Yazmin Maldonado, Ruben Salas, Joel A. Quevedo, Rogelio Valdez, Leonardo Trujillo

Geometric Semantic Genetic Programming (GSGP) proposed an important enhancement to GP-based learning, incorporating search operators that operate directly on the semantics of the parents with bounded effects on the semantics of the offspring. This approach posed any symbolic regression fitness landscape as a unimodal function, allowing for more directed search. Moreover, it became evident that the search could be implemented in a much more efficient manner that does not require the execution, evaluation or manipulation of variable-length syntactic models. Hence, efficient implementations of this algorithm have been developed using both CPU and GPU processing. However, current implementations are still ill-suited for real-time learning or for learning on devices with limited resources, scenarios that are becoming more prevalent with the continued development of the Internet-of-Things and the increased need for efficient and distributed learning on the Edge. This paper presents GSGP-Hardware, a fully pipelined and parallel design of GSGP developed entirely in VHDL, for implementation on FPGA devices. Using Vivado AMD-Xilinx for synthesis and simulation, GSGP-Hardware achieves an approximate improvement in efficiency, in terms of run time and Gpops/s, of three and four orders of magnitude, respectively, compared with the state-of-the-art GPU implementation. This is a performance increase that has not been achieved by other FPGA-based implementations of genetic programming. It is possible due to the manner in which GSGP evolves a model, and competitive accuracy is achieved by incorporating simple but powerful enhancements to the original GSGP algorithm. GSGP-Hardware allows for instantaneous symbolic regression, opening up new application domains for this powerful variant of genetic programming.
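
Why GSGP can run without manipulating syntax trees is easiest to see when the search is written over output vectors alone. The sketch below applies the standard geometric semantic mutation, p + ms * (r1 - r2), to semantics only. It is a software illustration, not the GSGP-Hardware design: the random vectors stand in for the outputs of random trees, and the step size `ms` and the greedy acceptance rule are assumptions.

```python
import random

def gs_mutation(parent, r1, r2, ms):
    """Geometric semantic mutation applied to output vectors: p + ms * (r1 - r2)."""
    return [p + ms * (a - b) for p, a, b in zip(parent, r1, r2)]

def rmse(sem, target):
    """Root mean squared error between a semantics vector and the target vector."""
    return (sum((s - t) ** 2 for s, t in zip(sem, target)) / len(target)) ** 0.5

def run(target, generations=200, ms=0.1, seed=0):
    rng = random.Random(seed)
    n = len(target)
    best = [0.0] * n
    for _ in range(generations):
        # Stand-ins for the bounded outputs of two random trees (assumption).
        r1 = [rng.random() for _ in range(n)]
        r2 = [rng.random() for _ in range(n)]
        child = gs_mutation(best, r1, r2, ms)
        if rmse(child, target) <= rmse(best, target):
            best = child
    return best

target = [1.0, 2.0, 3.0]
result = run(target)
```

Each generation costs only a few vector operations of fixed length, a workload that suits pipelined and parallel execution.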

Citations: 0

Geometric semantic GP with linear scaling: Darwinian versus Lamarckian evolution

IF 2.6, Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines

Pub Date : 2024-06-01 DOI: 10.1007/s10710-024-09488-0

Giorgia Nadizar, Berfin Sakallioglu, Fraser Garrow, Sara Silva, Leonardo Vanneschi

Geometric Semantic Genetic Programming (GSGP) has shown notable success in symbolic regression with the introduction of Linear Scaling (LS). This achievement stems from the synergy of the geometric semantic genetic operators of GSGP with the scaling of the individuals for computing their fitness, which favours programs with a promising behaviour. However, the initial combination of GSGP and LS (GSGP-LS) underutilised the potential of LS, scaling individuals only for fitness evaluation, neglecting to incorporate improvements into their genetic material. In this paper we propose an advancement, GSGP with Lamarckian LS (GSGP-LLS), wherein we update the individuals in the population with their scaling coefficients in a Lamarckian fashion, i.e., by inheritance of acquired traits. We assess GSGP-LS and GSGP-LLS against standard GSGP for the task of symbolic regression on five hand-tailored benchmarks and six real-life problems. On the former ones, GSGP-LS and GSGP-LLS both consistently improve GSGP, though with no clear global superiority between them. On the real-world problems, instead, GSGP-LLS steadily outperforms GSGP-LS, achieving faster convergence and superior final performance. Notably, even in cases where LS induces overfitting on challenging problems, GSGP-LLS surpasses GSGP-LS, due to its slower and more localised optimisation steps.
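
The difference between scaling only for fitness and writing the scaling back into the individual can be made concrete with ordinary least squares. The sketch below uses the standard linear scaling coefficients (slope = covariance / variance, intercept from the means); the function names and the representation of an individual by its output vector are assumptions for illustration.

```python
def ls_coefficients(outputs, targets):
    """Least-squares intercept a and slope b such that a + b*outputs best fits targets."""
    n = len(outputs)
    mean_o = sum(outputs) / n
    mean_t = sum(targets) / n
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    if var_o == 0:
        return mean_t, 0.0  # constant output: the best fit is the target mean
    cov = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    b = cov / var_o
    return mean_t - b * mean_o, b

def mse(outputs, targets):
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)

def scaled_fitness(outputs, targets):
    """Darwinian use of LS: scale only to compute fitness; the individual is unchanged."""
    a, b = ls_coefficients(outputs, targets)
    return mse([a + b * o for o in outputs], targets)

def lamarckian_update(outputs, targets):
    """Lamarckian use of LS: the scaled outputs replace the individual's own."""
    a, b = ls_coefficients(outputs, targets)
    return [a + b * o for o in outputs]

outputs, targets = [1.0, 2.0, 3.0], [3.0, 5.0, 7.0]
a, b = ls_coefficients(outputs, targets)     # a = 1.0, b = 2.0
updated = lamarckian_update(outputs, targets)  # [3.0, 5.0, 7.0]
```

In the Darwinian case the unscaled `outputs` are what offspring inherit, while in the Lamarckian case they inherit `updated`.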

Citations: 0

Hierarchical non-dominated sort: analysis and improvement

IF 2.6, Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines

Pub Date : 2024-04-16 DOI: 10.1007/s10710-024-09487-1

Ved Prakash, Sumit Mishra

Pareto dominance-based multiobjective evolutionary algorithms use non-dominated sorting to rank their solutions. In the last few decades, various approaches have been proposed for non-dominated sorting. However, the running-time analyses of some of these approaches have issues and are imprecise. In this paper, we focus on one such algorithm, hierarchical non-dominated sort (HNDS), whose running time is stated imprecisely, and obtain generic equations for the number of dominance comparisons in the worst and the best case. Based on the equation for the worst case, we obtain the worst-case running time as well as the scenario where the worst case occurs. Based on the equation for the best case, we identify a scenario where HNDS performs fewer dominance comparisons than presented in the original paper, which shows that the best-case analysis of the original paper is not rigorous. Finally, we present an improved version of HNDS that guarantees the worst-case time complexity claimed by the authors of HNDS, which is \(\mathcal{O}(MN^2)\).
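
Running-time bounds for non-dominated sorting are usually stated by counting dominance comparisons, each of which inspects up to M objective values for a pair drawn from N solutions (reading M and N this way is an assumption based on common usage). The sketch below is a naive front-peeling sort that counts those comparisons. It is a baseline for illustration only and is not HNDS or its improved version.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points):
    """Peel fronts one at a time; returns (fronts as index lists, dominance comparisons made)."""
    remaining = list(range(len(points)))
    fronts, comparisons = [], 0
    while remaining:
        front = []
        for i in remaining:
            dominated = False
            for j in remaining:
                if i == j:
                    continue
                comparisons += 1
                if dominates(points[j], points[i]):
                    dominated = True
                    break
            if not dominated:
                front.append(i)
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts, comparisons

points = [(1, 4), (2, 3), (3, 3), (4, 1), (4, 4)]
fronts, comparisons = non_dominated_sort(points)  # fronts == [[0, 1, 3], [2], [4]]
```

Instrumenting the sort with a counter like `comparisons` is a simple way to check a claimed best-case or worst-case formula against concrete inputs.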

Citations: 0

A genetic programming approach to the automated design of CNN models for image classification and video shorts creation

IF 2.6, Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines

Pub Date : 2024-03-14 DOI: 10.1007/s10710-024-09483-5

Neural architecture search (NAS) is a rapidly growing field which focuses on the automated design of neural network architectures. Genetic algorithms (GAs) have been predominantly used for evolving neural network architectures. Genetic programming (GP), a variation of GAs that works in the program space rather than a solution space, has not been as well researched for NAS. This paper aims to contribute to the research into GP for NAS. Previous research in this field can be divided into two categories. In the first, each program represents a neural network directly, or the components and parameters of a neural network. In the second, each program is a set of instructions which, when executed, produces a neural network. This study focuses on the second category, which has not been well researched. Previous work has used grammatical evolution for generating these programs. This study examines canonical GP for neural network design (GPNND) for this purpose. It also evaluates a variation of GP, iterative structure-based GP (ISBGP), for evolving these programs. The study compares the performance of GAs, GPNND and ISBGP for image classification and video shorts creation. Both GPNND and ISBGP were found to outperform GAs, with ISBGP producing better results than GPNND for both applications. Both GPNND and ISBGP produced better results than previous studies employing grammatical evolution on the CIFAR-10 dataset.
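
The second category, where a program is a set of instructions that produces a network when executed, can be illustrated with a tiny interpreter. The instruction set (`conv`, `pool`, `widen`), the layer tuples, and the 10-class output layer are hypothetical choices made for this sketch and are not the paper's primitives.

```python
from dataclasses import dataclass, field

@dataclass
class Network:
    """A network description: an ordered list of layer specification tuples."""
    layers: list = field(default_factory=list)

def execute(program, input_channels=3, num_classes=10):
    """Run a construction program; each instruction edits the network being built."""
    net, channels = Network(), input_channels
    for op, arg in program:
        if op == "conv":          # add a convolution with `arg` output channels
            net.layers.append(("conv3x3", channels, arg))
            channels = arg
        elif op == "pool":        # add max pooling with window size `arg`
            net.layers.append(("maxpool", arg))
        elif op == "widen":       # add a convolution that doubles the channel count
            net.layers.append(("conv3x3", channels, channels * 2))
            channels *= 2
        else:
            raise ValueError(f"unknown instruction: {op}")
    net.layers.append(("dense", channels, num_classes))
    return net

program = [("conv", 32), ("pool", 2), ("widen", None)]
net = execute(program)
```

Evolution then acts on `program`, while fitness is measured on the network that `execute` returns.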

Citations: 0

An ensemble learning interpretation of geometric semantic genetic programming

IF 2.6, Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines

Pub Date : 2024-03-11 DOI: 10.1007/s10710-024-09482-6

Grant Dick

Geometric semantic genetic programming (GSGP) is a variant of genetic programming (GP) that directly searches the semantic space of programs to produce candidate solutions. GSGP has shown considerable success in improving the performance of GP in terms of program correctness; however, this comes at the expense of exponential program growth. Subsequent attempts to address this growth have not fully exploited the fact that GSGP searches by producing linear combinations of existing solutions. This paper examines this property of GSGP and frames the method as an ensemble learning approach by redefining mutation and crossover as examples of boosting and stacking, respectively. The ensemble interpretation allows for simple integration of regularisation techniques that significantly reduce the size of the resultant programs. Additionally, this paper examines the quality of parse tree base learners within this ensemble learning interpretation of GSGP and suggests that future research could substantially improve the quality of GSGP by examining more effective initialisation techniques. The resulting ensemble learning interpretation leads to variants of GSGP that substantially improve upon the performance of traditional GSGP in regression contexts and produce a method that frequently outperforms gradient boosting.
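
The observation that GSGP "searches by producing linear combinations of existing solutions" can be shown by storing an individual as a set of weights over base learners instead of as a growing tree. The sketch below is one way to picture that property and is not the paper's formulation: the dictionary representation and the names `combine` and `predict` are assumptions.

```python
def combine(w_a: dict, w_b: dict, alpha: float) -> dict:
    """Crossover as a convex combination of two ensembles' base-learner weights."""
    keys = sorted(set(w_a) | set(w_b))
    return {k: alpha * w_a.get(k, 0.0) + (1 - alpha) * w_b.get(k, 0.0) for k in keys}

def predict(weights: dict, base_outputs: dict) -> list:
    """Ensemble output: weighted sum of the base learners' output vectors."""
    n = len(next(iter(base_outputs.values())))
    return [sum(w * base_outputs[k][i] for k, w in weights.items()) for i in range(n)]

# Outputs of two hypothetical base trees on two training cases.
base_outputs = {"t1": [1.0, 2.0], "t2": [3.0, 4.0]}
child = combine({"t1": 1.0}, {"t2": 1.0}, alpha=0.5)
prediction = predict(child, base_outputs)  # [2.0, 3.0]
```

Under this view the offspring never grows beyond one weight per base learner, and a regulariser could act directly on the weights.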

Citations: 0

Cellular geometric semantic genetic programming

IF 2.6, Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines

Pub Date : 2024-02-21 DOI: 10.1007/s10710-024-09480-8

Lorenzo Bonin, Luigi Rovito, Andrea De Lorenzo, Luca Manzoni

Among the different variants of Genetic Programming (GP), Geometric Semantic GP (GSGP) has proved to be both efficient and effective in finding good solutions. Because the operators of GSGP act on the semantics of individuals in a well-defined way, they provide guarantees on how the search is performed. GSGP is not, however, free from limitations, such as the premature convergence of the population to a small (and possibly sub-optimal) area of the search space. One reason for this issue could be that good individuals can quickly “spread” through the population, suppressing the emergence of competition. To mitigate this problem, we impose a cellular automata (CA) inspired communication topology over GSGP. In CAs, a collection of agents (finite state automata) is positioned on an n-dimensional periodic grid, and each agent communicates only locally with the automata in its neighbourhood. Similarly, we assign each individual a location on an n-dimensional grid, and the entire evolution of an individual happens locally, considering only the individuals in its neighbourhood. Specifically, we present an algorithm in which, at each generation, a subset of the neighbourhood of each individual is sampled, and selection for the given cell in the grid is performed by extracting the two best individuals of this subset, which are employed as parents for the Geometric Semantic Crossover. We compare this cellular GSGP (cGSGP) approach with standard GSGP on eight regression problems, showing that it can provide better solutions than GSGP. Moreover, by analyzing convergence rates, we show that the improvement is observable regardless of the number of executed generations. As a side effect, we additionally show that combining a small-neighbourhood-based cellular spatial structure with GSGP helps in producing smaller solutions. Finally, we measure the spatial autocorrelation of the population using Moran’s I coefficient to provide an overview of the diversity, showing that our cellular spatial structure helps maintain better diversity during the early stages of the evolution.
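
The selection scheme described in this abstract (sample part of each cell's neighbourhood, take the two best sampled individuals, recombine them with geometric semantic crossover) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: a 2-D periodic grid, a Moore neighbourhood of radius 1, individuals represented only by their semantics (the vector of outputs on the training cases), RMSE as fitness, and a crossover that uses a single random constant in [0, 1]. All function names are hypothetical.

```python
import random

def moore_neighbourhood(i, j, rows, cols, radius=1):
    """Cells within `radius` of (i, j) on a periodic 2-D grid, including (i, j).
    Assumes each side of the grid has at least 2 * radius + 1 cells."""
    return [((i + di) % rows, (j + dj) % cols)
            for di in range(-radius, radius + 1)
            for dj in range(-radius, radius + 1)]

def rmse(semantics, target):
    """Root mean squared error between an individual's outputs and the targets."""
    return (sum((s - t) ** 2 for s, t in zip(semantics, target)) / len(target)) ** 0.5

def gs_crossover(sem_a, sem_b, rng):
    """Simplified geometric semantic crossover on semantics vectors:
    a convex combination of the parents with one random constant (assumption)."""
    r = rng.random()
    return [r * a + (1 - r) * b for a, b in zip(sem_a, sem_b)]

def cellular_step(grid, target, sample_size, rng, radius=1):
    """One generation: every cell is replaced by the offspring of the two best
    individuals found in a random sample of that cell's neighbourhood."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            cells = moore_neighbourhood(i, j, rows, cols, radius)
            k = max(2, min(sample_size, len(cells)))  # need at least two parents
            sampled = rng.sample(cells, k)
            ranked = sorted(sampled, key=lambda c: rmse(grid[c[0]][c[1]], target))
            (a_i, a_j), (b_i, b_j) = ranked[:2]
            new_grid[i][j] = gs_crossover(grid[a_i][a_j], grid[b_i][b_j], rng)
    return new_grid
```

Because the offspring's semantics is a convex combination of its parents' semantics, its error can never exceed that of the worse parent. This is the kind of guarantee on the search that the abstract attributes to GSGP operators.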



Citations: 0

Neural network crossover in genetic algorithms using genetic programming

IF 2.6, Zone 3 (Computer Science), Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Genetic Programming and Evolvable Machines

Pub Date : 2024-02-21 DOI: 10.1007/s10710-024-09481-7

Kyle Pretorius, Nelishia Pillay

The use of genetic algorithms (GAs) to evolve neural network (NN) weights has risen in popularity in recent years, particularly when used together with gradient descent as a mutation operator. However, crossover operators are often omitted from such GAs, as they are seen as highly destructive and detrimental to the performance of the GA. Designing crossover operators that can be applied effectively to NNs has been an active area of research, with success limited to specific problem domains. The focus of this study is to use genetic programming (GP) to automatically evolve crossover operators that can be applied to NN weights and used in GAs. A novel GP is proposed and used to evolve both reusable and disposable crossover operators in order to compare their efficiency. Experiments are conducted to compare the performance of GAs using no crossover operator, or a commonly used human-designed crossover operator, with GAs using GP-evolved crossover operators. The results show that using GP to evolve disposable crossover operators leads to highly effective crossover operators that significantly improve the results obtained from the GA.
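
The setting this abstract describes, a GA over NN weights with gradient descent as the mutation operator and an optional crossover, can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the method of the paper: the "network" is a single linear neuron, selection is simple truncation with elitism, and `uniform_crossover` is only a stand-in for a hand-designed operator (the abstract does not say which one was used). The GP-evolved operators themselves are not reproduced here, and all names are hypothetical.

```python
import random

def loss(w, data):
    """Mean squared error of a one-neuron linear model y = w[0] * x + w[1]."""
    return sum((w[0] * x + w[1] - y) ** 2 for x, y in data) / len(data)

def gd_mutation(w, data, lr=0.1):
    """Mutation operator: one gradient-descent step on the MSE loss."""
    n = len(data)
    g0 = sum(2 * (w[0] * x + w[1] - y) * x for x, y in data) / n
    g1 = sum(2 * (w[0] * x + w[1] - y) for x, y in data) / n
    return [w[0] - lr * g0, w[1] - lr * g1]

def uniform_crossover(a, b, rng):
    """Stand-in hand-designed operator: each weight comes from either parent."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def evolve(pop, data, rng, generations=20, crossover=None):
    """GA loop: keep the better half unchanged, refill with mutated offspring.
    With crossover=None, a child starts as a copy of a single parent."""
    pop = [list(w) for w in pop]  # do not modify the caller's population
    for _ in range(generations):
        pop.sort(key=lambda w: loss(w, data))
        parents = pop[: max(2, len(pop) // 2)]
        children = []
        while len(parents) + len(children) < len(pop):
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng) if crossover else list(a)
            children.append(gd_mutation(child, data))
        pop = parents + children
    return min(pop, key=lambda w: loss(w, data))
```

Passing the crossover as a parameter mirrors the comparison in the abstract: the same GA can be run with no crossover, with a hand-designed operator, or with any other callable that takes two weight vectors and a random generator.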



Citations: 0