Co-evolving Recurrent Neural Networks and their Hyperparameters with Simplex Hyperparameter Optimization
A. Kini, S. Yadav, Aditya Shankar Thakur, A. Awari, Zimeng Lyu, Travis J. Desell
DOI: 10.1145/3583133.3596407 (https://doi.org/10.1145/3583133.3596407)
Published in: Proceedings of the Companion Conference on Genetic and Evolutionary Computation, July 15, 2023
Abstract
Designing machine learning models involves determining not only the network architecture but also non-architectural elements such as training hyperparameters. Further confounding this problem, different architectures and datasets achieve their best performance with different hyperparameters. The problem is exacerbated for neuroevolution (NE) and neural architecture search (NAS) algorithms, which generate and train a wide variety of architectures in their search for optimal ones. In such algorithms, if hyperparameters are fixed, the search can settle on suboptimal architectures because the results are biased toward those fixed values. This paper evaluates the simplex hyperparameter optimization (SHO) method, which co-evolves hyperparameters over the course of an NE run, enabling the algorithm to simultaneously optimize both network architectures and training hyperparameters. SHO has previously been shown to optimize hyperparameters for convolutional neural networks trained with traditional stochastic gradient descent with Nesterov momentum; this work extends that evaluation to evolving recurrent neural networks with additional modern weight optimizers such as RMSProp and Adam. Results show that incorporating SHO into the neuroevolution process not only finds better-performing architectures but also converges to optimal architectures faster, across all datasets and optimization methods tested.
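The abstract describes SHO only at a high level. As a rough illustration of the general idea of generating a child genome's training hyperparameters from a simplex of well-performing parents, here is a minimal Python sketch. The Nelder-Mead-style reflection step, the hyperparameter names, the sampling of the simplex, and the clamping bounds are illustrative assumptions, not the paper's exact SHO procedure.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class Genome:
    hyperparams: Dict[str, float]  # e.g. {"learning_rate": 0.01, "momentum": 0.9}
    fitness: float                 # validation loss; lower is better

def simplex_hyperparameters(population: List[Genome],
                            n_points: int = 5,
                            alpha: float = 1.5,
                            bounds: Optional[Dict[str, Tuple[float, float]]] = None) -> Dict[str, float]:
    """Generate child hyperparameters by reflecting the worst member of a
    randomly sampled simplex through the centroid of the better members
    (a Nelder-Mead-style step; illustrative, not the paper's exact method)."""
    simplex = sorted(random.sample(population, n_points), key=lambda g: g.fitness)
    better, worst = simplex[:-1], simplex[-1]
    keys = worst.hyperparams.keys()
    centroid = {k: sum(g.hyperparams[k] for g in better) / len(better) for k in keys}
    # Step from the worst point through the centroid of the better points.
    child = {k: centroid[k] + alpha * (centroid[k] - worst.hyperparams[k]) for k in keys}
    if bounds:
        # Clamp each hyperparameter to its assumed valid range.
        for k, (lo, hi) in bounds.items():
            if k in child:
                child[k] = min(max(child[k], lo), hi)
    return child

# Example usage: co-evolve a learning rate and momentum alongside the architecture search.
population = [Genome({"learning_rate": 10 ** random.uniform(-4, -1),
                      "momentum": random.uniform(0.5, 0.99)},
                     fitness=random.random())
              for _ in range(20)]
child_hp = simplex_hyperparameters(
    population, bounds={"learning_rate": (1e-5, 1.0), "momentum": (0.0, 0.999)})
```

In the paper's setting, a child genome produced by the NE algorithm would be trained with hyperparameters generated this way (for whichever weight optimizer is in use, e.g. SGD with Nesterov momentum, RMSProp, or Adam), so that both the architecture and its training configuration improve together.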