An Efficient Mixed Bit-width Searching Strategy for CNN Quantization based on BN Scale Factors
Xuecong Han, Xulin Zhou, Zhongjian Ma
2022 7th International Conference on Multimedia and Image Processing (published 2022-01-14). DOI: 10.1145/3517077.3517108
Abstract
In recent years, the rapid development of mixed-precision quantization has greatly reduced model size and computational cost. However, previous mixed bit-width strategies, such as those based on reinforcement learning or the Hessian matrix, are overly complicated. This paper proposes an efficient mixed bit-width searching strategy that measures the sensitivity of each convolutional layer using the scale factors of its BN layer. Because this strategy reuses the parameters of the pre-trained model and introduces no extra computation, it greatly simplifies bit-width selection. Comparative experiments on ResNet18 and ResNet50 evaluate the proposed strategy against several previous algorithms in terms of accuracy, model size, and computational cost. The results verify that the quantized accuracy is within 2% of the FP32 baseline and about 0.5% below HAWQ; overall, the performance is similar to that of HAWQ. This paper also compares the computational complexity of bit-width selection in HAWQ-V3 with that of the proposed strategy, showing that the latter is far lower.
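The core idea, assigning larger bit-widths to convolutional layers whose BN scale factors (gamma) indicate higher sensitivity, can be illustrated with a minimal sketch. This is not the authors' code: the layer names, the two-level bit-width menu, the mean-|gamma| sensitivity measure, and the top-half/bottom-half split are all illustrative assumptions.

```python
def assign_bitwidths(bn_gammas, bit_choices=(8, 4)):
    """Illustrative sketch of BN-based mixed bit-width selection.

    bn_gammas: dict mapping layer name -> list of BN gamma (scale) values.
    Layers with larger mean |gamma| are treated as more sensitive and
    receive the larger bit-width; the rest receive the smaller one.
    """
    # Sensitivity proxy (assumption): mean absolute BN scale factor per layer.
    sensitivity = {name: sum(abs(g) for g in gammas) / len(gammas)
                   for name, gammas in bn_gammas.items()}
    # Rank layers from most to least sensitive.
    ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
    # Hypothetical policy: top half gets the high bit-width, bottom half the low one.
    cutoff = len(ranked) // 2
    return {name: bit_choices[0] if i < cutoff else bit_choices[1]
            for i, name in enumerate(ranked)}

# Toy example with hypothetical layers and gamma values:
gammas = {
    "conv1": [0.9, 1.1, 0.8],    # large mean |gamma| -> sensitive
    "conv2": [0.05, 0.02, 0.1],  # small mean |gamma| -> robust to low precision
    "conv3": [0.4, 0.6, 0.5],
    "conv4": [0.01, 0.03, 0.02],
}
print(assign_bitwidths(gammas))  # conv1/conv3 -> 8 bits, conv2/conv4 -> 4 bits
```

Because the BN scale factors already exist in the pre-trained model, this selection requires only a sort over per-layer statistics, in contrast with the Hessian computations of HAWQ-style methods.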