{"title":"深度神经网络的精度感知结构化滤波器剪枝","authors":"Marina Villalba Carballo, Byeong Kil Lee","doi":"10.1109/CSCI51800.2020.00122","DOIUrl":null,"url":null,"abstract":"Deep neural networks (DNNs) have several technical issues on computational complexity, redundancy, and the parameter size – especially when applied in embedded devices. Among those issues, lots of parameters require high memory capacity which causes migration problem to embedded devices. Many pruning techniques are proposed to reduce the network size in deep neural networks, but there are still various issues that exist for applying pruning techniques to DNNs. In this paper, we propose a simple-yet-efficient scheme, accuracy-aware structured pruning based on the characterization of each convolutional layer. We investigate the accuracy and compression rate of individual layer with a fixed pruning ratio and re-order the pruning priority depending on the accuracy of each layer. To achieve a further compression rate, we also add quantization to the linear layers. Our results show that the order of the layers pruned does affect the final accuracy of the deep neural network. Based on our experiments, the pruned AlexNet and VGG16 models’ parameter size is compressed up to 47.28x and 35.21x with less than 1% accuracy drop with respect to the original model.","PeriodicalId":336929,"journal":{"name":"2020 International Conference on Computational Science and Computational Intelligence (CSCI)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Accuracy-aware Structured Filter Pruning for Deep Neural Networks\",\"authors\":\"Marina Villalba Carballo, Byeong Kil Lee\",\"doi\":\"10.1109/CSCI51800.2020.00122\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep neural networks (DNNs) have several technical issues on computational complexity, redundancy, and the parameter size – especially when applied in embedded devices. Among those issues, lots of parameters require high memory capacity which causes migration problem to embedded devices. Many pruning techniques are proposed to reduce the network size in deep neural networks, but there are still various issues that exist for applying pruning techniques to DNNs. In this paper, we propose a simple-yet-efficient scheme, accuracy-aware structured pruning based on the characterization of each convolutional layer. We investigate the accuracy and compression rate of individual layer with a fixed pruning ratio and re-order the pruning priority depending on the accuracy of each layer. To achieve a further compression rate, we also add quantization to the linear layers. Our results show that the order of the layers pruned does affect the final accuracy of the deep neural network. 
Based on our experiments, the pruned AlexNet and VGG16 models’ parameter size is compressed up to 47.28x and 35.21x with less than 1% accuracy drop with respect to the original model.\",\"PeriodicalId\":336929,\"journal\":{\"name\":\"2020 International Conference on Computational Science and Computational Intelligence (CSCI)\",\"volume\":\"46 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Computational Science and Computational Intelligence (CSCI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CSCI51800.2020.00122\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Computational Science and Computational Intelligence (CSCI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CSCI51800.2020.00122","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Accuracy-aware Structured Filter Pruning for Deep Neural Networks
Deep neural networks (DNNs) face several technical challenges related to computational complexity, redundancy, and parameter size, especially when deployed on embedded devices. In particular, the large number of parameters requires high memory capacity, which makes migration to embedded devices difficult. Many pruning techniques have been proposed to reduce network size in deep neural networks, but various issues remain when applying them to DNNs. In this paper, we propose a simple yet efficient scheme: accuracy-aware structured pruning based on the characterization of each convolutional layer. We investigate the accuracy and compression rate of each individual layer under a fixed pruning ratio and re-order the pruning priority according to the per-layer accuracy. To achieve further compression, we also apply quantization to the linear layers. Our results show that the order in which layers are pruned does affect the final accuracy of the deep neural network. In our experiments, the parameter sizes of the pruned AlexNet and VGG16 models are compressed by up to 47.28x and 35.21x, respectively, with less than 1% accuracy drop with respect to the original models.
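To illustrate the idea of characterizing layers under a fixed pruning ratio and ordering them by accuracy impact, the following is a minimal PyTorch sketch, not the authors' implementation: each convolutional layer is pruned in isolation with L1-norm structured filter pruning, its retained accuracy is measured, and the layers are ranked so the least sensitive ones are pruned first. The model, data loader, and evaluate helper are assumed to be supplied by the caller and are hypothetical names.

```python
# Sketch of accuracy-aware structured filter pruning (assumptions noted above).
import copy
import torch.nn as nn
import torch.nn.utils.prune as prune

def prune_conv_filters(layer: nn.Conv2d, ratio: float) -> None:
    """Zero out the `ratio` fraction of output filters with the smallest L1 norm."""
    prune.ln_structured(layer, name="weight", amount=ratio, n=1, dim=0)
    prune.remove(layer, "weight")  # fold the pruning mask into the weights

def rank_layers_by_sensitivity(model, loader, evaluate, ratio=0.5):
    """Prune each conv layer alone at a fixed ratio and record retained accuracy.

    Layers whose accuracy drops least are ranked first (prune them earliest).
    """
    baseline = evaluate(model, loader)          # hypothetical accuracy helper
    scores = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            trial = copy.deepcopy(model)        # prune a copy, keep the original intact
            prune_conv_filters(dict(trial.named_modules())[name], ratio)
            scores[name] = evaluate(trial, loader)
    return sorted(scores, key=lambda n: baseline - scores[n])

# Usage (illustrative): order = rank_layers_by_sensitivity(model, val_loader, evaluate)
# then prune layers in that order, fine-tuning and checking accuracy after each step.
```

For the additional quantization of linear layers mentioned above, one common option (again only a sketch, not necessarily the authors' choice) is dynamic quantization of nn.Linear modules, e.g. torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8), applied after pruning.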