Rasa Khosrowshahli, Shahryar Rahnamayan, Beatrice Ombuki-Berman
{"title":"深度学习中的大规模维度缩减和与元启发式算法的混合","authors":"Rasa Khosrowshahli, Shahryar Rahnamayan, Beatrice Ombuki-Berman","doi":"arxiv-2408.07194","DOIUrl":null,"url":null,"abstract":"Deep learning is mainly based on utilizing gradient-based optimization for\ntraining Deep Neural Network (DNN) models. Although robust and widely used,\ngradient-based optimization algorithms are prone to getting stuck in local\nminima. In this modern deep learning era, the state-of-the-art DNN models have\nmillions and billions of parameters, including weights and biases, making them\nhuge-scale optimization problems in terms of search space. Tuning a huge number\nof parameters is a challenging task that causes vanishing/exploding gradients\nand overfitting; likewise, utilized loss functions do not exactly represent our\ntargeted performance metrics. A practical solution to exploring large and\ncomplex solution space is meta-heuristic algorithms. Since DNNs exceed\nthousands and millions of parameters, even robust meta-heuristic algorithms,\nsuch as Differential Evolution, struggle to efficiently explore and converge in\nsuch huge-dimensional search spaces, leading to very slow convergence and high\nmemory demand. To tackle the mentioned curse of dimensionality, the concept of\nblocking was recently proposed as a technique that reduces the search space\ndimensions by grouping them into blocks. In this study, we aim to introduce\nHistogram-based Blocking Differential Evolution (HBDE), a novel approach that\nhybridizes gradient-based and gradient-free algorithms to optimize parameters.\nExperimental results demonstrated that the HBDE could reduce the parameters in\nthe ResNet-18 model from 11M to 3K during the training/optimizing phase by\nmetaheuristics, namely, the proposed HBDE, which outperforms baseline\ngradient-based and parent gradient-free DE algorithms evaluated on CIFAR-10 and\nCIFAR-100 datasets showcasing its effectiveness with reduced computational\ndemands for the very first time.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"18 1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Massive Dimensions Reduction and Hybridization with Meta-heuristics in Deep Learning\",\"authors\":\"Rasa Khosrowshahli, Shahryar Rahnamayan, Beatrice Ombuki-Berman\",\"doi\":\"arxiv-2408.07194\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning is mainly based on utilizing gradient-based optimization for\\ntraining Deep Neural Network (DNN) models. Although robust and widely used,\\ngradient-based optimization algorithms are prone to getting stuck in local\\nminima. In this modern deep learning era, the state-of-the-art DNN models have\\nmillions and billions of parameters, including weights and biases, making them\\nhuge-scale optimization problems in terms of search space. Tuning a huge number\\nof parameters is a challenging task that causes vanishing/exploding gradients\\nand overfitting; likewise, utilized loss functions do not exactly represent our\\ntargeted performance metrics. A practical solution to exploring large and\\ncomplex solution space is meta-heuristic algorithms. 
Since DNNs exceed\\nthousands and millions of parameters, even robust meta-heuristic algorithms,\\nsuch as Differential Evolution, struggle to efficiently explore and converge in\\nsuch huge-dimensional search spaces, leading to very slow convergence and high\\nmemory demand. To tackle the mentioned curse of dimensionality, the concept of\\nblocking was recently proposed as a technique that reduces the search space\\ndimensions by grouping them into blocks. In this study, we aim to introduce\\nHistogram-based Blocking Differential Evolution (HBDE), a novel approach that\\nhybridizes gradient-based and gradient-free algorithms to optimize parameters.\\nExperimental results demonstrated that the HBDE could reduce the parameters in\\nthe ResNet-18 model from 11M to 3K during the training/optimizing phase by\\nmetaheuristics, namely, the proposed HBDE, which outperforms baseline\\ngradient-based and parent gradient-free DE algorithms evaluated on CIFAR-10 and\\nCIFAR-100 datasets showcasing its effectiveness with reduced computational\\ndemands for the very first time.\",\"PeriodicalId\":501347,\"journal\":{\"name\":\"arXiv - CS - Neural and Evolutionary Computing\",\"volume\":\"18 1 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Neural and Evolutionary Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.07194\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Neural and Evolutionary Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.07194","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Massive Dimensions Reduction and Hybridization with Meta-heuristics in Deep Learning
Deep learning relies primarily on gradient-based optimization for training Deep Neural Network (DNN) models. Although robust and widely used, gradient-based optimization algorithms are prone to getting stuck in local minima. In the modern deep learning era, state-of-the-art DNN models have millions to billions of parameters (weights and biases), which makes their training a huge-scale optimization problem in terms of search space. Tuning such a huge number of parameters is a challenging task that suffers from vanishing/exploding gradients and overfitting; moreover, the loss functions used for training do not exactly represent the targeted performance metrics. Meta-heuristic algorithms offer a practical way to explore such large and
complex solution spaces. However, since DNNs have millions of parameters, even robust meta-heuristic algorithms such as Differential Evolution (DE) struggle to explore and converge efficiently in such high-dimensional search spaces, leading to very slow convergence and high memory demand.
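For context, the following minimal sketch of classic DE (DE/rand/1/bin) illustrates why population-based search becomes expensive at DNN scale: the population alone stores pop_size x dim floats, and every generation evaluates pop_size candidate vectors. All names and settings here are illustrative, not taken from the paper.

```python
import numpy as np

def de_rand_1_bin(loss_fn, dim, pop_size=50, F=0.5, CR=0.9, generations=100, seed=0):
    """Minimal DE/rand/1/bin. Memory grows as pop_size * dim, which becomes
    prohibitive when dim is in the millions (e.g., ~11M for ResNet-18)."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    fitness = np.array([loss_fn(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # pick three distinct individuals different from i
            a, b, c = pop[rng.choice(np.delete(np.arange(pop_size), i), 3, replace=False)]
            mutant = a + F * (b - c)                # differential mutation
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True         # force at least one mutated gene
            trial = np.where(cross, mutant, pop[i]) # binomial crossover
            f_trial = loss_fn(trial)
            if f_trial <= fitness[i]:               # greedy selection
                pop[i], fitness[i] = trial, f_trial
    best = int(np.argmin(fitness))
    return pop[best], fitness[best]
```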
To tackle this curse of dimensionality, the concept of blocking was recently proposed as a technique that reduces the dimensionality of the search space by grouping dimensions into blocks.
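The sketch below shows the blocking idea in its simplest form, purely as an illustration (the abstract does not specify how blocks are assigned): every parameter is mapped to one of a small number of blocks, the meta-heuristic searches only over one value per block, and a decoder expands that short vector back to the full parameter dimension.

```python
import numpy as np

def expand_blocks(block_values, block_ids):
    """Decode a low-dimensional block vector into the full parameter vector.

    block_values : shape (n_blocks,) -- the vector the meta-heuristic optimizes
    block_ids    : shape (n_params,) -- fixed assignment of each parameter to a block
    """
    return block_values[block_ids]  # shape (n_params,)

# Toy usage: 1M parameters compressed to a 3K-dimensional search space.
n_params, n_blocks = 1_000_000, 3_000
block_ids = np.random.randint(0, n_blocks, size=n_params)  # illustrative assignment
candidate = np.random.randn(n_blocks)                      # one DE individual
full_weights = expand_blocks(candidate, block_ids)         # decoded weights fed to the DNN
```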
In this study, we introduce Histogram-based Blocking Differential Evolution (HBDE), a novel approach that hybridizes gradient-based and gradient-free algorithms to optimize DNN parameters.
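One plausible reading of the "histogram-based" blocking step, sketched below strictly as an assumption (the abstract does not spell out the mechanism): parameters obtained from gradient-based pretraining are binned by value into a fixed-size histogram, parameters sharing a bin form a block, and DE then optimizes one representative value per bin (e.g., ~3K values instead of ~11M for ResNet-18).

```python
import numpy as np

def histogram_blocking(pretrained_params, n_bins=3000):
    """Assign each pretrained parameter to a histogram bin of its value.

    Returns the fixed bin assignment and the initial per-bin values
    (bin centers), which become the low-dimensional DE search space.
    This is an interpretation of the abstract, not the authors' exact method.
    """
    edges = np.histogram_bin_edges(pretrained_params, bins=n_bins)
    bin_ids = np.clip(np.digitize(pretrained_params, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return bin_ids, centers

# Hybrid outline: gradient-based pretraining produces `weights`; DE then refines
# the 3K-dimensional per-bin vector instead of all 11M individual weights.
weights = np.random.randn(11_000_000).astype(np.float32)  # stand-in for ResNet-18 params
bin_ids, bin_values = histogram_blocking(weights, n_bins=3000)
reconstructed = bin_values[bin_ids]                        # decode back to full weights
```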
Experimental results demonstrate that HBDE reduces the number of parameters optimized by the meta-heuristic in the ResNet-18 model from 11M to 3K during the training/optimization phase. Evaluated on the CIFAR-10 and CIFAR-100 datasets, the proposed HBDE outperforms both the baseline gradient-based algorithm and its parent gradient-free DE algorithm, showcasing, for the very first time, its effectiveness with reduced computational demands.