A 40nm area-efficient Effective-bit-combination-based DNN accelerator with the reconfigurable multiplier
Yanghan Zheng, Zhaofang Li, Kaihang Sun, Kuang Lee, K. Tang
2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS), 11 June 2023
DOI: 10.1109/AICAS57966.2023.10168550
Deep neural networks (DNNs) are widely used in tasks such as image classification and speech recognition. When DNNs are deployed on edge devices, the inputs and weights are usually quantized, and the data distribution shows clear patterns: most values contain many redundant bits, which lowers the utilization of computation resources. We propose an area-efficient DNN accelerator with an effective-bit-combination mechanism and a reconfigurable multiplier. Building on a modified Baugh-Wooley multiplier, the proposed multiplier processes two 4-bit multiplication operations in one cycle while consuming only 1.57 times the area and 2.31 times the power of a conventional multiplier. Exploiting the data distribution in DNNs, we further propose a gating scheme for weights of 0, -1, and 1, which reduces power consumption by 34.96%. Implemented in 40nm CMOS technology, the proposed DNN accelerator achieves a normalized area efficiency 1.11 to 4.90 times higher than previous works [4]-[7].
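As a rough illustration of the two mechanisms summarized in the abstract, the following Python sketch (not taken from the paper, which realizes them as hardware) models a multiplier that returns two signed 4-bit products per call and a MAC step that gates the multiplier for weights of 0, -1, and +1. All function names and the operand packing are illustrative assumptions, not the authors' design.

```python
# Behavioral sketch (illustrative, not the paper's implementation):
# (1) two independent signed 4-bit products, standing in for a multiplier
#     reconfigured to produce both results in one cycle;
# (2) a MAC step that skips the multiplier for the common weights 0, -1, +1.

def to_signed4(x: int) -> int:
    """Interpret the low 4 bits of x as a two's-complement signed value."""
    x &= 0xF
    return x - 16 if x & 0x8 else x

def dual_mul4(a_hi: int, a_lo: int, w_hi: int, w_lo: int) -> tuple[int, int]:
    """Two signed 4x4 products, computed side by side here; in hardware both
    would come from one reconfigured multiplier pass."""
    return to_signed4(a_hi) * to_signed4(w_hi), to_signed4(a_lo) * to_signed4(w_lo)

def gated_mac(acc: int, activation: int, weight: int) -> int:
    """Gate the multiplier for trivial weights instead of issuing a multiply."""
    if weight == 0:
        return acc                  # no switching activity at all
    if weight == 1:
        return acc + activation     # pass-through instead of a full multiply
    if weight == -1:
        return acc - activation     # negation instead of a full multiply
    return acc + activation * weight

# Example: a gated MAC step and a pair of packed 4-bit products.
acc = gated_mac(0, activation=5, weight=-1)                # -> -5, multiplier unused
p_hi, p_lo = dual_mul4(0b0111, 0b1001, 0b0011, 0b0101)     # 7*3 = 21, (-7)*5 = -35
print(acc, p_hi, p_lo)
```

In hardware, the benefit comes from sharing one partial-product array between the two 4-bit products and from suppressing switching activity whenever the weight is trivial; the sketch only mimics the functional behavior of those two ideas.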