A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations

Hongrong Cheng;Miao Zhang;Javen Qinfeng Shi
{"title":"A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations","authors":"Hongrong Cheng;Miao Zhang;Javen Qinfeng Shi","doi":"10.1109/TPAMI.2024.3447085","DOIUrl":null,"url":null,"abstract":"Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models on resource-constrained environments and to accelerate inference time, researchers have increasingly explored pruning techniques as a popular research direction in neural network compression. More than three thousand pruning papers have been published from 2020 to 2024. However, there is a dearth of up-to-date comprehensive review papers on pruning. To address this issue, in this survey, we provide a comprehensive review of existing research works on deep neural network pruning in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning and other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrast settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights, etc.) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models, post-training pruning, and different levels of supervision for pruning to shed light on the commonalities and differences of existing methods and lay the foundation for further method development. Finally, we provide some valuable recommendations on selecting pruning methods and prospect several promising research directions for neural network pruning. To facilitate future research on deep neural network pruning, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding, etc.) and build a curated collection of datasets, networks, and evaluations on different applications. We maintain a repository on \n<uri>https://github.com/hrcheng1066/awesome-pruning</uri>\n that serves as a comprehensive resource for neural network pruning papers and corresponding open-source codes. We will keep updating this repository to include the latest advancements in the field.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"46 12","pages":"10558-10578"},"PeriodicalIF":0.0000,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10643325/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models on resource-constrained environments and to accelerate inference time, researchers have increasingly explored pruning techniques as a popular research direction in neural network compression. More than three thousand pruning papers have been published from 2020 to 2024. However, there is a dearth of up-to-date comprehensive review papers on pruning. To address this issue, in this survey, we provide a comprehensive review of existing research works on deep neural network pruning in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning and other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrast settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights, etc.) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models, post-training pruning, and different levels of supervision for pruning to shed light on the commonalities and differences of existing methods and lay the foundation for further method development. Finally, we provide some valuable recommendations on selecting pruning methods and prospect several promising research directions for neural network pruning. To facilitate future research on deep neural network pruning, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding, etc.) and build a curated collection of datasets, networks, and evaluations on different applications. We maintain a repository on https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source codes. We will keep updating this repository to include the latest advancements in the field.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
深度神经网络剪枝调查:分类、比较、分析和建议。
现代深度神经网络,尤其是最新的大型语言模型,具有庞大的模型规模,需要大量的计算和存储资源。为了在资源受限的环境中部署现代模型并加快推理时间,研究人员越来越多地探索剪枝技术,将其作为神经网络压缩的一个热门研究方向。从 2020 年到 2024 年,已经发表了三千多篇剪枝论文。然而,关于剪枝的最新综合综述论文却十分匮乏。为解决这一问题,我们在本调查报告中全面回顾了现有的深度神经网络剪枝研究工作,并从以下几个方面进行了分类:1)通用/特定加速;2)何时剪枝;3)如何剪枝;4)剪枝与其他压缩技术的融合。然后,我们对剪枝的八对对比设置(如非结构化/结构化、单次/迭代、无数据/数据驱动、初始化/预训练权重等)进行了深入的对比分析,并探讨了几个新出现的主题,包括大型语言模型、视觉变换器、扩散模型和大型多模态模型的剪枝、后训练剪枝以及剪枝的不同监督水平,以揭示现有方法的共性和差异,为进一步的方法开发奠定基础。最后,我们就剪枝方法的选择提出了一些有价值的建议,并展望了神经网络剪枝的几个有前景的研究方向。为了促进未来对深度神经网络剪枝的研究,我们总结了剪枝的广泛应用(如对抗鲁棒性、自然语言理解等),并建立了一个数据集、网络和不同应用评估的集合。我们在 https://github.com/hrcheng1066/awesome-pruning 上维护了一个资源库,作为神经网络剪枝论文和相应开源代码的综合资源。我们将不断更新该资源库,以纳入该领域的最新进展。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Language-Inspired Relation Transfer for Few-Shot Class-Incremental Learning. Multi-Modality Multi-Attribute Contrastive Pre-Training for Image Aesthetics Computing. 360SFUDA++: Towards Source-Free UDA for Panoramic Segmentation by Learning Reliable Category Prototypes. Anti-Forgetting Adaptation for Unsupervised Person Re-Identification. Evolved Hierarchical Masking for Self-Supervised Learning.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1