{"title":"深度神经网络的精度感知结构化滤波器剪枝","authors":"Marina Villalba Carballo, Byeong Kil Lee","doi":"10.1109/CSCI51800.2020.00122","DOIUrl":null,"url":null,"abstract":"Deep neural networks (DNNs) have several technical issues on computational complexity, redundancy, and the parameter size – especially when applied in embedded devices. Among those issues, lots of parameters require high memory capacity which causes migration problem to embedded devices. Many pruning techniques are proposed to reduce the network size in deep neural networks, but there are still various issues that exist for applying pruning techniques to DNNs. In this paper, we propose a simple-yet-efficient scheme, accuracy-aware structured pruning based on the characterization of each convolutional layer. We investigate the accuracy and compression rate of individual layer with a fixed pruning ratio and re-order the pruning priority depending on the accuracy of each layer. To achieve a further compression rate, we also add quantization to the linear layers. Our results show that the order of the layers pruned does affect the final accuracy of the deep neural network. Based on our experiments, the pruned AlexNet and VGG16 models’ parameter size is compressed up to 47.28x and 35.21x with less than 1% accuracy drop with respect to the original model.","PeriodicalId":336929,"journal":{"name":"2020 International Conference on Computational Science and Computational Intelligence (CSCI)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Accuracy-aware Structured Filter Pruning for Deep Neural Networks\",\"authors\":\"Marina Villalba Carballo, Byeong Kil Lee\",\"doi\":\"10.1109/CSCI51800.2020.00122\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep neural networks (DNNs) have several technical issues on computational complexity, redundancy, and the parameter size – especially when applied in embedded devices. Among those issues, lots of parameters require high memory capacity which causes migration problem to embedded devices. Many pruning techniques are proposed to reduce the network size in deep neural networks, but there are still various issues that exist for applying pruning techniques to DNNs. In this paper, we propose a simple-yet-efficient scheme, accuracy-aware structured pruning based on the characterization of each convolutional layer. We investigate the accuracy and compression rate of individual layer with a fixed pruning ratio and re-order the pruning priority depending on the accuracy of each layer. To achieve a further compression rate, we also add quantization to the linear layers. Our results show that the order of the layers pruned does affect the final accuracy of the deep neural network. 
Based on our experiments, the pruned AlexNet and VGG16 models’ parameter size is compressed up to 47.28x and 35.21x with less than 1% accuracy drop with respect to the original model.\",\"PeriodicalId\":336929,\"journal\":{\"name\":\"2020 International Conference on Computational Science and Computational Intelligence (CSCI)\",\"volume\":\"46 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Computational Science and Computational Intelligence (CSCI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CSCI51800.2020.00122\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Computational Science and Computational Intelligence (CSCI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CSCI51800.2020.00122","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Accuracy-aware Structured Filter Pruning for Deep Neural Networks
Deep neural networks (DNNs) face several technical challenges related to computational complexity, redundancy, and parameter size, especially when deployed on embedded devices. In particular, the large number of parameters requires high memory capacity, which makes migration to embedded devices difficult. Many pruning techniques have been proposed to reduce network size in deep neural networks, but various issues remain when applying them to DNNs. In this paper, we propose a simple yet efficient scheme: accuracy-aware structured pruning based on the characterization of each convolutional layer. We investigate the accuracy and compression rate of each individual layer under a fixed pruning ratio and re-order the pruning priority according to the per-layer accuracy. To achieve further compression, we also apply quantization to the linear layers. Our results show that the order in which layers are pruned does affect the final accuracy of the deep neural network. In our experiments, the parameter sizes of the pruned AlexNet and VGG16 models are compressed by up to 47.28x and 35.21x, respectively, with less than 1% accuracy drop with respect to the original models.
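To illustrate the idea of characterizing layers under a fixed pruning ratio and ordering them by accuracy impact, the following is a minimal PyTorch sketch, not the authors' implementation: each convolutional layer is pruned in isolation with L1-norm structured filter pruning, its retained accuracy is measured, and the layers are ranked so the least sensitive ones are pruned first. The model, data loader, and evaluate helper are assumed to be supplied by the caller and are hypothetical names.

```python
# Sketch of accuracy-aware structured filter pruning (assumptions noted above).
import copy
import torch.nn as nn
import torch.nn.utils.prune as prune

def prune_conv_filters(layer: nn.Conv2d, ratio: float) -> None:
    """Zero out the `ratio` fraction of output filters with the smallest L1 norm."""
    prune.ln_structured(layer, name="weight", amount=ratio, n=1, dim=0)
    prune.remove(layer, "weight")  # fold the pruning mask into the weights

def rank_layers_by_sensitivity(model, loader, evaluate, ratio=0.5):
    """Prune each conv layer alone at a fixed ratio and record retained accuracy.

    Layers whose accuracy drops least are ranked first (prune them earliest).
    """
    baseline = evaluate(model, loader)          # hypothetical accuracy helper
    scores = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            trial = copy.deepcopy(model)        # prune a copy, keep the original intact
            prune_conv_filters(dict(trial.named_modules())[name], ratio)
            scores[name] = evaluate(trial, loader)
    return sorted(scores, key=lambda n: baseline - scores[n])

# Usage (illustrative): order = rank_layers_by_sensitivity(model, val_loader, evaluate)
# then prune layers in that order, fine-tuning and checking accuracy after each step.
```

For the additional quantization of linear layers mentioned above, one common option (again only a sketch, not necessarily the authors' choice) is dynamic quantization of nn.Linear modules, e.g. torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8), applied after pruning.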