{"title":"Empirical Analysis of Fixed Point Precision Quantization of CNNs","authors":"Anaam Ansari, T. Ogunfunmi","doi":"10.1109/MWSCAS.2019.8885263","DOIUrl":null,"url":null,"abstract":"Image classification, speech processing, autonomous driving, and medical diagnosis have made Convolutional Neural Networks (CNN) mainstream. Due to their success, many deep networks have been developed such as AlexNet, VGGNet, GoogleNet, ResidualNet [1]–[4],etc. Implementing these deep and complex networks in hardware is a challenge. There have been many hardware and algorithmic solutions to improve the throughput, latency and accuracy. Compression and optimization techniques help reduce the size of the model while maintaining the accuracy. Traditionally, quantization of weights and inputs are used to reduce the memory transfer and power consumption. Quantizing the outputs of layers can be a challenge since the output of each layer changes with the input. In this paper, we use quantization on the output of each layer for AlexNet and VGGNET16 sequentially to analyze the effect it has on accuracy. We use Signal to Quantization Noise Ratio (SQNR) to empirically determine the integer length (IL) as well as the fractional length (FL) for the fixed point precision. Based on our observations, we can report that accuracy is sensitive to fractional length as well as integer length. For AlexNet we observe deterioration in accuracy as the word length decreases. The results are similar in the case of VGGNET16.","PeriodicalId":287815,"journal":{"name":"2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS)","volume":"27 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MWSCAS.2019.8885263","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Image classification, speech processing, autonomous driving, and medical diagnosis have made Convolutional Neural Networks (CNNs) mainstream. Due to their success, many deep networks have been developed, such as AlexNet, VGGNet, GoogleNet, ResidualNet [1]–[4], etc. Implementing these deep and complex networks in hardware is a challenge, and many hardware and algorithmic solutions have been proposed to improve throughput, latency, and accuracy. Compression and optimization techniques help reduce the size of the model while maintaining accuracy. Traditionally, quantization of weights and inputs is used to reduce memory traffic and power consumption. Quantizing the outputs of layers is harder, since the output of each layer changes with the input. In this paper, we sequentially quantize the output of each layer of AlexNet and VGGNet16 and analyze the effect on accuracy. We use the Signal to Quantization Noise Ratio (SQNR) to empirically determine the integer length (IL) and the fractional length (FL) of the fixed-point representation. Based on our observations, accuracy is sensitive to both the fractional length and the integer length. For AlexNet, we observe a deterioration in accuracy as the word length decreases, and the results are similar for VGGNet16.
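As a minimal sketch (not the authors' code), the snippet below illustrates the kind of fixed-point quantization and SQNR measurement the abstract describes: a layer output is rounded to a signed fixed-point grid defined by an integer length (IL) and a fractional length (FL), and the SQNR is used to compare IL/FL splits for a fixed word length. The function names and the synthetic activation data are assumptions for illustration only.

```python
import numpy as np

def quantize_fixed_point(x, il, fl):
    """Quantize x to signed fixed point with `il` integer bits and `fl`
    fractional bits (word length = 1 sign bit + il + fl, an assumed convention)."""
    scale = 2.0 ** fl
    max_val = 2.0 ** il - 1.0 / scale        # largest representable value
    min_val = -(2.0 ** il)                   # most negative representable value
    q = np.round(x * scale) / scale          # round to the nearest step of 2^-fl
    return np.clip(q, min_val, max_val)      # saturate to the dynamic range

def sqnr_db(x, x_q):
    """Signal to Quantization Noise Ratio in dB."""
    noise = x - x_q
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(noise ** 2))

# Example: sweep IL/FL splits for a synthetic layer output and report SQNR.
rng = np.random.default_rng(0)
activations = rng.normal(0.0, 4.0, size=10_000)  # stand-in for one layer's output
word_length = 8                                   # bits excluding the sign bit
for il in range(word_length + 1):
    fl = word_length - il
    q = quantize_fixed_point(activations, il, fl)
    print(f"IL={il} FL={fl} SQNR={sqnr_db(activations, q):.2f} dB")
```

In this sketch, a small IL clips large activations (saturation noise) while a small FL coarsens the rounding step, so the best split maximizes SQNR for the given word length; the paper applies this kind of analysis layer by layer to AlexNet and VGGNet16.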