Deep Neural Network Compression Method Based on Product Quantization
Xiuqin Fang, Han Liu, Guo Xie, Youmin Zhang, Ding Liu
2020 39th Chinese Control Conference (CCC), 2020-07-01. DOI: 10.23919/CCC50068.2020.9188698
Citations: 3
Abstract
This paper proposes a method that combines product quantization with pruning to compress deep neural networks with large model sizes and heavy computational loads. First, pruning removes redundant parameters from the network, after which the pruned network is fine-tuned to recover accuracy. Product quantization then quantizes the network parameters to 8 bits, reducing storage overhead so that the network can be deployed on embedded devices. On the classification tasks of the MNIST and CIFAR-10 datasets, network models such as LeNet-5, AlexNet, and ResNet are compressed by factors of 23 to 38 with minimal loss of accuracy.
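The abstract does not detail the authors' exact quantization procedure, but the core idea of product quantization can be sketched generically: split each weight vector into sub-vectors, learn a per-subspace codebook with k-means, and store each sub-vector as an 8-bit index (256 centroids). The function names, subspace count, and k-means settings below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def product_quantize(W, num_subspaces=4, num_centroids=256, iters=10):
    """Product-quantize a 2-D weight matrix (a generic sketch, not the
    paper's exact algorithm).

    Each row of W is split into `num_subspaces` sub-vectors; each
    subspace gets its own codebook of `num_centroids` centroids
    (8-bit codes when num_centroids == 256), learned with a few
    plain k-means iterations.
    """
    rows, cols = W.shape
    assert cols % num_subspaces == 0, "columns must divide evenly"
    d = cols // num_subspaces
    rng = np.random.default_rng(0)
    codebooks, codes = [], []
    for s in range(num_subspaces):
        X = W[:, s * d:(s + 1) * d]                    # (rows, d) sub-vectors
        k = min(num_centroids, rows)
        C = X[rng.choice(rows, size=k, replace=False)]  # init from data points
        for _ in range(iters):                          # Lloyd's k-means
            # assign each sub-vector to its nearest centroid
            a = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
            for j in range(k):                          # update centroids
                if np.any(a == j):
                    C[j] = X[a == j].mean(axis=0)
        codebooks.append(C)
        codes.append(a.astype(np.uint8))                # 8-bit storage
    return codebooks, codes

def reconstruct(codebooks, codes):
    """Rebuild an approximate weight matrix from codebooks and codes."""
    return np.concatenate([cb[c] for cb, c in zip(codebooks, codes)], axis=1)
```

Storage drops because each d-dimensional float sub-vector is replaced by a single byte plus a shared codebook; in the paper this step is applied after pruning and fine-tuning, which is why the combined pipeline reaches 23x to 38x compression.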