{"title":"基于高效非结构化修剪策略的面向数据流的细粒度稀疏CNN加速器","authors":"Tianyang Yu, Bi Wu, Ke Chen, C. Yan, Weiqiang Liu","doi":"10.1145/3526241.3530318","DOIUrl":null,"url":null,"abstract":"Network pruning can effectively alleviate the excessive parameters and computation issues in CNNs. However, unstructured pruning is not hardware friendly, while structured pruning will result in a significant loss of accuracy. In this paper, an unstructured fine-grained pruning strategy is proposed and achieves a 16X compression ratio with a top-1 accuracy loss of 1.4% for VGG-16. Combined with the proposed hardware-oriented hyperparameter selection method, compression rates of up to 64X can be obtained while fully meeting the edge-side accuracy requirements. Further, a light-weight, high-performance sparse CNN accelerator with modified systolic array is proposed for pruned VGG-16. The experimental results show that compared with the most advanced design, the proposed accelerator can achieve 21 Frames Per Second (FPS) with 3X better power efficiency and 2.19X better calculation density.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Data Stream Oriented Fine-grained Sparse CNN Accelerator with Efficient Unstructured Pruning Strategy\",\"authors\":\"Tianyang Yu, Bi Wu, Ke Chen, C. Yan, Weiqiang Liu\",\"doi\":\"10.1145/3526241.3530318\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Network pruning can effectively alleviate the excessive parameters and computation issues in CNNs. However, unstructured pruning is not hardware friendly, while structured pruning will result in a significant loss of accuracy. In this paper, an unstructured fine-grained pruning strategy is proposed and achieves a 16X compression ratio with a top-1 accuracy loss of 1.4% for VGG-16. Combined with the proposed hardware-oriented hyperparameter selection method, compression rates of up to 64X can be obtained while fully meeting the edge-side accuracy requirements. Further, a light-weight, high-performance sparse CNN accelerator with modified systolic array is proposed for pruned VGG-16. The experimental results show that compared with the most advanced design, the proposed accelerator can achieve 21 Frames Per Second (FPS) with 3X better power efficiency and 2.19X better calculation density.\",\"PeriodicalId\":188228,\"journal\":{\"name\":\"Proceedings of the Great Lakes Symposium on VLSI 2022\",\"volume\":\"41 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Great Lakes Symposium on VLSI 2022\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3526241.3530318\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Great Lakes Symposium on VLSI 2022","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3526241.3530318","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Data Stream Oriented Fine-grained Sparse CNN Accelerator with Efficient Unstructured Pruning Strategy
Network pruning can effectively alleviate the excessive parameters and computation issues in CNNs. However, unstructured pruning is not hardware friendly, while structured pruning will result in a significant loss of accuracy. In this paper, an unstructured fine-grained pruning strategy is proposed and achieves a 16X compression ratio with a top-1 accuracy loss of 1.4% for VGG-16. Combined with the proposed hardware-oriented hyperparameter selection method, compression rates of up to 64X can be obtained while fully meeting the edge-side accuracy requirements. Further, a light-weight, high-performance sparse CNN accelerator with modified systolic array is proposed for pruned VGG-16. The experimental results show that compared with the most advanced design, the proposed accelerator can achieve 21 Frames Per Second (FPS) with 3X better power efficiency and 2.19X better calculation density.