Title: A low functional redundancy-based network slimming method for accelerating deep neural networks
Authors: Zheng Fang, Bo Yin
Journal: Alexandria Engineering Journal, Volume 119, Pages 437-450
DOI: 10.1016/j.aej.2024.12.118
Published: 2025-02-07
URL: https://www.sciencedirect.com/science/article/pii/S1110016824017162
Citations: 0
Abstract
Deep neural networks (DNNs) have been widely criticized for their large parameter counts and computational demands, which hinder deployment to edge and embedded devices. To reduce the floating point operations (FLOPs) required to run DNNs and accelerate inference, we start from model pruning and achieve this goal by removing useless network parameters. In this research, we propose a low functional redundancy-based network slimming method (LFRNS) that finds and removes functionally redundant filters with a feature clustering algorithm. However, the redundancy of some key features is beneficial to the model, and removing these features limits the model's potential to some extent. Building on this view, we propose a feature contribution ranking unit (FCR unit) that automatically learns each feature map's contribution to the key information over training iterations. The FCR unit assists LFRNS in restoring important elements from the pruning set to break the performance bottleneck of the slimmed model. Our method mainly removes feature maps with similar functions instead of only pruning the unimportant parts, thus preserving the integrity of the features' functions and avoiding network degradation. We conduct experiments on image classification with the CIFAR-10 and CIFAR-100 datasets. Our framework achieves over 2.0× reductions in parameters and FLOPs while keeping the accuracy loss below 1%, and even improves the accuracy of large-volume models. We also apply our method to the vision transformer (ViT) and achieve performance comparable to state-of-the-art methods with nearly 1.5× less computation.
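The abstract describes pruning filters that are functionally redundant, i.e. filters whose outputs duplicate one another, rather than filters that are merely low-magnitude. The paper's exact LFRNS and FCR-unit algorithms are not given here, but the core idea of clustering similar filters and keeping one representative per cluster can be sketched as follows. This is an illustrative approximation using greedy cosine-similarity grouping on flattened convolution weights; the function name, the threshold, and the greedy grouping rule are all assumptions, not the authors' method.

```python
import numpy as np

def prune_redundant_filters(weights, sim_threshold=0.9):
    """Greedy clustering sketch of functional-redundancy pruning.

    weights: conv weight tensor of shape (out_channels, in_channels, kH, kW).
    Groups filters whose flattened weights have cosine similarity above
    sim_threshold and keeps one representative index per group.
    NOTE: illustrative only -- not the paper's LFRNS algorithm, which
    clusters feature maps and uses an FCR unit to restore key filters.
    """
    n = weights.shape[0]
    flat = weights.reshape(n, -1)
    # Normalize rows so the dot product below is cosine similarity.
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T                      # (n, n) pairwise cosine similarity

    keep = []
    assigned = np.zeros(n, dtype=bool)
    for i in range(n):
        if assigned[i]:
            continue
        keep.append(i)                       # representative of a new cluster
        assigned |= sim[i] >= sim_threshold  # absorb all filters similar to i
    return np.array(keep)
```

In a real pipeline the kept indices would be used to slice the layer's weights (and the next layer's input channels), followed by fine-tuning to recover accuracy; the FCR unit in the paper additionally learns which clustered-away filters are worth restoring.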
Journal introduction:
Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in the field of engineering and applied science. Alexandria Engineering Journal is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). The papers published in Alexandria Engineering Journal are grouped into five sections, according to the following classification:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering