An Efficient Mixed Bit-width Searching Strategy for CNN Quantization based on BN Scale Factors
Xuecong Han, Xulin Zhou, Zhongjian Ma
2022 7th International Conference on Multimedia and Image Processing (published 2022-01-14). DOI: 10.1145/3517077.3517108
Abstract
In recent years, the rapid development of mixed-precision quantization has greatly reduced model size and computational cost. However, previous mixed bit-width strategies, such as those based on reinforcement learning or the Hessian matrix, are overly complicated. This paper proposes an efficient mixed bit-width searching strategy that measures the sensitivity of each convolutional layer using the scale factors of its BN layer. Because this strategy reuses the parameters of the pre-trained model and introduces no extra computation, it greatly simplifies bit-width selection. Comparative experiments on ResNet18 and ResNet50 evaluate the proposed strategy against several previous algorithms in terms of accuracy, model size, and computational cost. The results verify that the quantized accuracy is within 2% of the FP32 baseline and about 0.5% below HAWQ; overall, the performance is similar to that of HAWQ. This paper also compares the computational complexity of bit-width selection in HAWQ-V3 with that of the proposed strategy, showing that the latter is far lower.
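The core idea, assigning larger bit-widths to convolutional layers whose BN scale factors (gamma) indicate higher sensitivity, can be illustrated with a minimal sketch. This is not the authors' code: the layer names, the two-level bit-width menu, the mean-|gamma| sensitivity measure, and the top-half/bottom-half split are all illustrative assumptions.

```python
def assign_bitwidths(bn_gammas, bit_choices=(8, 4)):
    """Illustrative sketch of BN-based mixed bit-width selection.

    bn_gammas: dict mapping layer name -> list of BN gamma (scale) values.
    Layers with larger mean |gamma| are treated as more sensitive and
    receive the larger bit-width; the rest receive the smaller one.
    """
    # Sensitivity proxy (assumption): mean absolute BN scale factor per layer.
    sensitivity = {name: sum(abs(g) for g in gammas) / len(gammas)
                   for name, gammas in bn_gammas.items()}
    # Rank layers from most to least sensitive.
    ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
    # Hypothetical policy: top half gets the high bit-width, bottom half the low one.
    cutoff = len(ranked) // 2
    return {name: bit_choices[0] if i < cutoff else bit_choices[1]
            for i, name in enumerate(ranked)}

# Toy example with hypothetical layers and gamma values:
gammas = {
    "conv1": [0.9, 1.1, 0.8],    # large mean |gamma| -> sensitive
    "conv2": [0.05, 0.02, 0.1],  # small mean |gamma| -> robust to low precision
    "conv3": [0.4, 0.6, 0.5],
    "conv4": [0.01, 0.03, 0.02],
}
print(assign_bitwidths(gammas))  # conv1/conv3 -> 8 bits, conv2/conv4 -> 4 bits
```

Because the BN scale factors already exist in the pre-trained model, this selection requires only a sort over per-layer statistics, in contrast with the Hessian computations of HAWQ-style methods.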