Massive Dimensions Reduction and Hybridization with Meta-heuristics in Deep Learning

Rasa Khosrowshahli, Shahryar Rahnamayan, Beatrice Ombuki-Berman
{"title":"深度学习中的大规模维度缩减和与元启发式算法的混合","authors":"Rasa Khosrowshahli, Shahryar Rahnamayan, Beatrice Ombuki-Berman","doi":"arxiv-2408.07194","DOIUrl":null,"url":null,"abstract":"Deep learning is mainly based on utilizing gradient-based optimization for\ntraining Deep Neural Network (DNN) models. Although robust and widely used,\ngradient-based optimization algorithms are prone to getting stuck in local\nminima. In this modern deep learning era, the state-of-the-art DNN models have\nmillions and billions of parameters, including weights and biases, making them\nhuge-scale optimization problems in terms of search space. Tuning a huge number\nof parameters is a challenging task that causes vanishing/exploding gradients\nand overfitting; likewise, utilized loss functions do not exactly represent our\ntargeted performance metrics. A practical solution to exploring large and\ncomplex solution space is meta-heuristic algorithms. Since DNNs exceed\nthousands and millions of parameters, even robust meta-heuristic algorithms,\nsuch as Differential Evolution, struggle to efficiently explore and converge in\nsuch huge-dimensional search spaces, leading to very slow convergence and high\nmemory demand. To tackle the mentioned curse of dimensionality, the concept of\nblocking was recently proposed as a technique that reduces the search space\ndimensions by grouping them into blocks. In this study, we aim to introduce\nHistogram-based Blocking Differential Evolution (HBDE), a novel approach that\nhybridizes gradient-based and gradient-free algorithms to optimize parameters.\nExperimental results demonstrated that the HBDE could reduce the parameters in\nthe ResNet-18 model from 11M to 3K during the training/optimizing phase by\nmetaheuristics, namely, the proposed HBDE, which outperforms baseline\ngradient-based and parent gradient-free DE algorithms evaluated on CIFAR-10 and\nCIFAR-100 datasets showcasing its effectiveness with reduced computational\ndemands for the very first time.","PeriodicalId":501347,"journal":{"name":"arXiv - CS - Neural and Evolutionary Computing","volume":"18 1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Massive Dimensions Reduction and Hybridization with Meta-heuristics in Deep Learning\",\"authors\":\"Rasa Khosrowshahli, Shahryar Rahnamayan, Beatrice Ombuki-Berman\",\"doi\":\"arxiv-2408.07194\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning is mainly based on utilizing gradient-based optimization for\\ntraining Deep Neural Network (DNN) models. Although robust and widely used,\\ngradient-based optimization algorithms are prone to getting stuck in local\\nminima. In this modern deep learning era, the state-of-the-art DNN models have\\nmillions and billions of parameters, including weights and biases, making them\\nhuge-scale optimization problems in terms of search space. Tuning a huge number\\nof parameters is a challenging task that causes vanishing/exploding gradients\\nand overfitting; likewise, utilized loss functions do not exactly represent our\\ntargeted performance metrics. A practical solution to exploring large and\\ncomplex solution space is meta-heuristic algorithms. 
Since DNNs exceed\\nthousands and millions of parameters, even robust meta-heuristic algorithms,\\nsuch as Differential Evolution, struggle to efficiently explore and converge in\\nsuch huge-dimensional search spaces, leading to very slow convergence and high\\nmemory demand. To tackle the mentioned curse of dimensionality, the concept of\\nblocking was recently proposed as a technique that reduces the search space\\ndimensions by grouping them into blocks. In this study, we aim to introduce\\nHistogram-based Blocking Differential Evolution (HBDE), a novel approach that\\nhybridizes gradient-based and gradient-free algorithms to optimize parameters.\\nExperimental results demonstrated that the HBDE could reduce the parameters in\\nthe ResNet-18 model from 11M to 3K during the training/optimizing phase by\\nmetaheuristics, namely, the proposed HBDE, which outperforms baseline\\ngradient-based and parent gradient-free DE algorithms evaluated on CIFAR-10 and\\nCIFAR-100 datasets showcasing its effectiveness with reduced computational\\ndemands for the very first time.\",\"PeriodicalId\":501347,\"journal\":{\"name\":\"arXiv - CS - Neural and Evolutionary Computing\",\"volume\":\"18 1 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Neural and Evolutionary Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.07194\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Neural and Evolutionary Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.07194","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Deep learning relies primarily on gradient-based optimization to train Deep Neural Network (DNN) models. Although robust and widely used, gradient-based optimization algorithms are prone to getting stuck in local minima. In the modern deep learning era, state-of-the-art DNN models have millions to billions of parameters, including weights and biases, which makes their training a huge-scale optimization problem in terms of search space. Tuning such a huge number of parameters is challenging: it can cause vanishing/exploding gradients and overfitting, and the loss functions used in training do not exactly represent the targeted performance metrics. Meta-heuristic algorithms are a practical way to explore large and complex solution spaces. However, because DNNs have thousands to millions of parameters, even robust meta-heuristics such as Differential Evolution (DE) struggle to explore and converge efficiently in such huge-dimensional search spaces, leading to very slow convergence and high memory demand. To tackle this curse of dimensionality, the concept of blocking was recently proposed: a technique that reduces the search-space dimensionality by grouping dimensions into blocks. In this study, we introduce Histogram-based Blocking Differential Evolution (HBDE), a novel approach that hybridizes gradient-based and gradient-free algorithms to optimize parameters. Experimental results show that HBDE reduces the number of parameters the meta-heuristic must optimize in ResNet-18 from 11M to 3K during the training/optimization phase, and that it outperforms both the baseline gradient-based optimizer and the parent gradient-free DE algorithm on the CIFAR-10 and CIFAR-100 datasets, demonstrating, for the first time, its effectiveness at reduced computational demands.
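
To make the blocking idea concrete, below is a minimal NumPy sketch of how histogram-based blocking and a plain DE/rand/1/bin loop could fit together. The binning rule, the use of the bin mean as each block's representative value, the function names, the DE hyper-parameters, and the toy objective are all illustrative assumptions for exposition; they are not taken from the paper, whose exact HBDE algorithm may differ.

```python
# Illustrative sketch only: the binning rule, bin-mean representatives,
# DE hyper-parameters, and the toy objective are assumptions, not the
# paper's exact HBDE algorithm.
import numpy as np

def make_blocks(params: np.ndarray, n_bins: int):
    """Group parameters into blocks by histogram-binning their values,
    returning one representative value (the bin mean) per block."""
    edges = np.histogram_bin_edges(params, bins=n_bins)
    block_id = np.digitize(params, edges[1:-1])          # block index in 0..n_bins-1
    sums = np.bincount(block_id, weights=params, minlength=n_bins)
    counts = np.bincount(block_id, minlength=n_bins)
    reduced = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    return block_id, reduced

def expand(reduced: np.ndarray, block_id: np.ndarray) -> np.ndarray:
    """Rebuild the full parameter vector from the reduced block values."""
    return reduced[block_id]

def de_generation(pop, fit, loss, F=0.5, CR=0.9, rng=None):
    """One classic DE/rand/1/bin generation over the reduced vectors."""
    rng = rng or np.random.default_rng()
    n, d = pop.shape
    for i in range(n):
        a, b, c = rng.choice(np.delete(np.arange(n), i), size=3, replace=False)
        mutant = pop[a] + F * (pop[b] - pop[c])
        mask = rng.random(d) < CR
        mask[rng.integers(d)] = True                     # ensure at least one gene crosses
        trial = np.where(mask, mutant, pop[i])
        f = loss(trial)
        if f < fit[i]:                                   # greedy selection
            pop[i], fit[i] = trial, f
    return pop, fit

# Toy demo at a reduced scale (the paper reports 11M -> 3K for ResNet-18):
# DE only ever sees the reduced vector, never the raw parameters.
rng = np.random.default_rng(0)
w = rng.normal(size=1_000_000).astype(np.float32)        # stand-in for flattened weights
block_id, reduced = make_blocks(w, n_bins=300)

target = rng.normal(size=w.shape).astype(np.float32)     # hypothetical stand-in objective
loss = lambda r: float(np.mean((expand(r, block_id) - target) ** 2))

pop = reduced + 0.01 * rng.normal(size=(10, reduced.size))
fit = np.array([loss(p) for p in pop])
for _ in range(5):
    pop, fit = de_generation(pop, fit, loss, rng=rng)
print(f"best loss after 5 generations: {fit.min():.4f}")
```

In HBDE's hybrid setting, a loop like this would complement gradient-based training rather than replace it; the reconstruction rule and the stand-alone schedule shown here are purely assumptions of this sketch.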