Modular Grammatical Evolution for the Generation of Artificial Neural Networks
Khabat Soltanian; Ali Ebnenasir; Mohsen Afsharchi
Evolutionary Computation 30(2): 291–327, June 2022. DOI: 10.1162/evco_a_00302

This article presents a novel method, called Modular Grammatical Evolution (MGE), toward validating the hypothesis that restricting the solution space of NeuroEvolution to modular and simple neural networks enables the efficient generation of smaller and more structured neural networks while providing acceptable (and in some cases superior) accuracy on large data sets. MGE also enhances state-of-the-art Grammatical Evolution (GE) methods in two directions. First, MGE's representation is modular in that each individual has a set of genes, and each gene is mapped to a neuron by grammatical rules. Second, the proposed representation mitigates two important drawbacks of GE, namely low scalability and weak locality of representation, toward generating modular and multilayer networks with a high number of neurons. We define and evaluate five different forms of structures with and without modularity using MGE and find single-layer modules with no coupling to be the most productive. Our experiments demonstrate that modularity helps in finding better neural networks faster. We have validated the proposed method using ten well-known classification benchmarks with different sizes, feature counts, and output class counts. Our experimental results indicate that MGE provides superior accuracy with respect to existing NeuroEvolution methods and returns classifiers that are significantly simpler than other machine-learning-generated classifiers. Finally, we empirically demonstrate that MGE outperforms other GE methods in terms of locality and scalability.
Transfer Learning Based Co-Surrogate Assisted Evolutionary Bi-Objective Optimization for Objectives with Non-Uniform Evaluation Times
Xilu Wang; Yaochu Jin; Sebastian Schmitt; Markus Olhofer
Evolutionary Computation 30(2): 221–251, June 2022. DOI: 10.1162/evco_a_00300

Most existing multiobjective evolutionary algorithms (MOEAs) implicitly assume that each objective function can be evaluated within the same period of time. This assumption is untenable in many real-world optimization scenarios where evaluating different objectives involves different computer simulations or physical experiments with distinct time complexity. To address this issue, a transfer learning scheme based on surrogate-assisted evolutionary algorithms (SAEAs) is proposed, in which a co-surrogate is adopted to model the functional relationship between the fast and slow objective functions, and a transferable-instance selection method is introduced to acquire useful knowledge from the search process of the fast objective. Our experimental results on the DTLZ and UF test suites demonstrate that the proposed algorithm is competitive for solving bi-objective optimization problems whose objectives have non-uniform evaluation times.
Evolving Multimodal Robot Behavior via Many Stepping Stones with the Combinatorial Multiobjective Evolutionary Algorithm
Joost Huizinga; Jeff Clune
Evolutionary Computation 30(2): 131–164, June 2022. DOI: 10.1162/evco_a_00301

An important challenge in reinforcement learning is to solve multimodal problems, where agents have to act in qualitatively different ways depending on the circumstances. Because multimodal problems are often too difficult to solve directly, it can be helpful to define a curriculum, an ordered set of subtasks that serve as stepping stones for solving the overall problem. Unfortunately, choosing an effective ordering for these subtasks is difficult, and a poor ordering can reduce the performance of the learning process. Here, we provide a thorough introduction and investigation of the Combinatorial Multiobjective Evolutionary Algorithm (CMOEA), which allows all combinations of subtasks to be explored simultaneously. We compare CMOEA against three algorithms that can similarly optimize on multiple subtasks simultaneously: NSGA-II, NSGA-III, and ε-Lexicase Selection. The algorithms are tested on a function-optimization problem with two subtasks, a simulated multimodal robot locomotion problem with six subtasks, and a simulated robot maze-navigation problem where a hundred random mazes are treated as subtasks. On these problems, CMOEA either outperforms or is competitive with the controls. As a separate contribution, we show that adding a linear combination over all objectives can improve the ability of the control algorithms to solve these multimodal problems. Lastly, we show that CMOEA can leverage auxiliary objectives more effectively than the controls on the multimodal locomotion task. In general, our experiments suggest that CMOEA is a promising algorithm for solving multimodal problems.
VSD-MOEA: A Dominance-Based Multiobjective Evolutionary Algorithm with Explicit Variable Space Diversity Management
Joel Chacón Castillo; Carlos Segura; Carlos A. Coello Coello
Evolutionary Computation 30(2): 195–219, June 2022. DOI: 10.1162/evco_a_00299

Most state-of-the-art multiobjective evolutionary algorithms (MOEAs) promote the preservation of diversity in objective function space but neglect diversity in decision variable space. The aim of this article is to show that explicitly managing the amount of diversity maintained in decision variable space increases the quality of MOEAs as measured by metrics of the objective space. Our novel Variable Space Diversity-based MOEA (VSD-MOEA) explicitly considers the diversity of both decision variable and objective function space. This information is used to properly adapt the balance between exploration and intensification during the optimization process. In particular, at the initial stages, decisions made by the approach are biased more by information on the diversity of the variable space, whereas the approach gradually grants more importance to the diversity of objective function space as evolution progresses. The latter is achieved through a novel density estimator. The new method is compared with state-of-the-art MOEAs using several benchmarks with two and three objectives. The proposal yields much better results than state-of-the-art schemes on metrics applied to objective function space, exhibiting more stable and robust behavior.
Convergence Analysis of the Hessian Estimation Evolution Strategy
Tobias Glasmachers; Oswin Krause
Evolutionary Computation 30(1): 27–50, March 2022. DOI: 10.1162/evco_a_00295

The class of algorithms called Hessian Estimation Evolution Strategies (HE-ESs) update the covariance matrix of their sampling distribution by directly estimating the curvature of the objective function. The approach is practically efficient, as attested by respectable performance on the BBOB testbed, even on rather irregular functions. In this article, we formally prove two strong guarantees for the (1 + 4)-HE-ES, a minimal elitist member of the family: stability of the covariance matrix update and, as a consequence, linear convergence on all convex quadratic problems at a rate that is independent of the problem instance.
Runtime Analysis of Restricted Tournament Selection for Bimodal Optimisation
Edgar Covantes Osuna; Dirk Sudholt
Evolutionary Computation 30(1): 1–26, March 2022. DOI: 10.1162/evco_a_00292

Niching methods have been developed to maintain population diversity, to investigate many peaks in parallel, and to reduce the effect of genetic drift. We present the first rigorous runtime analyses of restricted tournament selection (RTS), embedded in a (μ+1) EA, and analyse its effectiveness at finding both optima of the bimodal function TwoMax. In RTS, an offspring competes against the closest individual, with respect to some distance measure, amongst w (window size) population members chosen uniformly at random with replacement, to encourage competition within the same niche. We prove that RTS finds both optima on TwoMax efficiently if the window size w is large enough. However, if w is too small, RTS fails to find both optima even in exponential time, with high probability. We further consider a variant of RTS that selects tournament individuals without replacement. It yields a more diverse tournament and is more effective at preventing one niche from taking over the other. However, this comes at the expense of slower progress towards the optima when a niche collapses to a single individual. Our theoretical results are accompanied by experimental studies that shed light on parameters not covered by the theoretical results and support a conjectured lower runtime bound.
An Analysis of the Influence of Noneffective Instructions in Linear Genetic Programming
Léo Françoso Dal Piccol Sotto; Franz Rothlauf; Vinícius Veloso de Melo; Márcio P. Basgalupp
Evolutionary Computation 30(1): 51–74, March 2022. DOI: 10.1162/evco_a_00296

Linear Genetic Programming (LGP) represents programs as sequences of instructions and has a directed acyclic graph (DAG) dataflow. The results of instructions are stored in registers that can be used as arguments by other instructions. Instructions that are disconnected from the main part of the program are called noneffective instructions, or structural introns. They also appear in other DAG-based GP approaches like Cartesian Genetic Programming (CGP). This article studies four hypotheses on the role of structural introns: noneffective instructions (1) serve as evolutionary memory, where evolved information is stored and later used in search; (2) preserve population diversity; (3) allow neutral search, where structural introns increase the number of neutral mutations and improve performance; and (4) serve as genetic material to enable program growth. We study different variants of LGP that control the influence of introns on symbolic regression, classification, and digital circuit problems. We find that (1) there is evolved information in the noneffective instructions that can be reactivated and that (2) structural introns can promote programs with higher effective diversity. However, both effects have no influence on LGP search performance. On the other hand, allowing mutations to be applied not only to effective but also to noneffective instructions (3) increases the rate of neutral mutations and (4) contributes to program growth by making use of the genetic material available as structural introns. This is accompanied by a significant increase in LGP performance, which makes structural introns important for LGP.
High-Dimensional Unbalanced Binary Classification by Genetic Programming with Multi-Criterion Fitness Evaluation and Selection
Wenbin Pei; Bing Xue; Lin Shang; Mengjie Zhang
Evolutionary Computation 30(1): 99–129, March 2022. DOI: 10.1162/evco_a_00304

High-dimensional unbalanced classification is challenging because of the joint effects of high dimensionality and class imbalance. Genetic programming (GP) has potential benefits for high-dimensional classification due to its built-in capability to select informative features. However, when data are not evenly distributed, GP tends to develop biased classifiers that achieve high accuracy on the majority class but low accuracy on the minority class. Unfortunately, the minority class is often at least as important as the majority class, so it is important to investigate how GP can be effectively utilized for high-dimensional unbalanced classification. In this article, to address the performance bias of GP, a new two-criterion fitness function is developed that considers an approximation of the area under the curve (AUC) and the classification clarity (i.e., how well a program can separate the two classes). The values obtained on the two criteria are combined in pairs, instead of being summed together. Furthermore, this article designs a three-criterion tournament selection to effectively identify and select good programs to be used by genetic operators for generating offspring during the evolutionary learning process. The experimental results show that the proposed method achieves better classification performance than the compared methods.
Shape-Constrained Symbolic Regression—Improving Extrapolation with Prior Knowledge
G. Kronberger; F. O. de Franca; B. Burlacu; C. Haider; M. Kommenda
Evolutionary Computation 30(1): 75–98, March 2022. DOI: 10.1162/evco_a_00294

We investigate the addition of constraints on the function image and its derivatives for incorporating prior knowledge into symbolic regression. The approach, called shape-constrained symbolic regression, allows us to enforce, for example, monotonicity of the function over selected inputs. The aim is to find models that conform to expected behavior and have improved extrapolation capabilities. We demonstrate the feasibility of the idea and propose and compare two evolutionary algorithms for shape-constrained symbolic regression: (i) an extension of tree-based genetic programming that discards infeasible solutions in the selection step, and (ii) a two-population evolutionary algorithm that separates feasible from infeasible solutions. In both algorithms we use interval arithmetic to approximate bounds for the models and their partial derivatives. The algorithms are tested on a set of 19 synthetic and four real-world regression problems. Both algorithms are able to identify models that conform to the shape constraints, which is not the case for unmodified symbolic regression algorithms. However, the predictive accuracy of the constrained models is worse on both the training and the test set. Shape-constrained polynomial regression produces the best results on the test set, but also significantly larger models.