Hang Xiao, Haobo Xu, Xiaoming Chen, Yujie Wang, Yinhe Han
Proceedings of the Great Lakes Symposium on VLSI 2022. Published 2022-06-06. DOI: 10.1145/3526241.3530322 (https://doi.org/10.1145/3526241.3530322)
P3S: A High Accuracy Probabilistic Prediction Processing System for CNN Acceleration
Convolutional Neural Networks (CNNs) achieve state-of-the-art performance on perception tasks at the cost of billions of computational operations. In this paper, we propose a probabilistic prediction processing system, dubbed P3S, that eliminates redundant, compute-heavy convolution operations by predicting whether output activations are zero-valued. Exploiting the Gaussian-like distribution of activations and weights in CNNs, P3S computes a partial convolution over only those values that exceed a threshold derived from the standard deviation, and uses the result to predict which output activations are ineffectual. If an output activation is predicted to be zero, P3S skips the remaining convolution operations and sets the output to zero in advance. P3S reduces computation by 67% with an accuracy loss within 0.2%, and requires no retraining or fine-tuning of the CNNs. We further implement a P3S-based CNN accelerator that achieves, on average, a 2.02x speedup and 2.23x better energy efficiency than a traditional accelerator. Compared with the state-of-the-art prediction-based accelerator with 3% accuracy degradation, P3S yields up to 1.49x speedup and 1.69x better energy efficiency.
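The prediction idea described in the abstract can be illustrated with a minimal sketch for a single output value. This is not the authors' implementation: the function name `predict_and_compute`, the multiplier `k`, the use of a ReLU output, thresholding on weights only, and the rule "predict zero when the partial sum is non-positive" are all assumptions made for illustration, since the abstract does not specify them.

```python
import statistics


def predict_and_compute(activations, weights, k=1.0):
    """Compute one ReLU output with an early zero prediction.

    Hypothetical sketch: returns (output, multiplies_used).
    """
    # Assumed threshold: k times the population standard deviation of the weights.
    threshold = k * statistics.pstdev(weights)

    # Split positions into "large" weights (used for prediction) and the rest.
    large = [i for i, w in enumerate(weights) if abs(w) > threshold]
    small = [i for i, w in enumerate(weights) if abs(w) <= threshold]

    # Partial convolution over the large-magnitude weights only.
    partial = sum(activations[i] * weights[i] for i in large)

    # Assumed decision rule: a non-positive partial sum predicts a zero output,
    # so the remaining multiply-accumulates are skipped.
    if partial <= 0.0:
        return 0.0, len(large)

    # Otherwise finish the remaining terms and apply ReLU.
    total = partial + sum(activations[i] * weights[i] for i in small)
    return max(total, 0.0), len(weights)
```

For example, with weights `[-2.0, 0.1, -0.1, 0.2]` and all activations equal to `1.0`, only the first weight exceeds the threshold, the partial sum is negative, and the output is set to zero after one multiplication instead of four. A misprediction (a partial sum that is non-positive while the full sum is positive) is where accuracy loss would come from in a scheme like this.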