{"title":"An Energy-Efficient Accelerator with Relative- Indexing Memory for Sparse Compressed Convolutional Neural Network","authors":"I-Chen Wu, Po-Tsang Huang, Chin-Yang Lo, W. Hwang","doi":"10.1109/AICAS.2019.8771600","DOIUrl":null,"url":null,"abstract":"Deep convolutional neural networks (CNNs) are widely used in image recognition and feature classification. However, deep CNNs are hard to be fully deployed for edge devices due to both computation-intensive and memory-intensive workloads. The energy efficiency of CNNs is dominated by off-chip memory accesses and convolution computation. In this paper, an energy-efficient accelerator is proposed for sparse compressed CNNs by reducing DRAM accesses and eliminating zero-operand computation. Weight compression is utilized for sparse compressed CNNs to reduce the required memory capacity/bandwidth and a large portion of connections. Thus, ReLU function produces zero-valued activations. Additionally, the workloads are distributed based on channels to increase the degree of task parallelism, and all-row- to-all-row non-zero element multiplication is adopted for skipping redundant computation. 
The simulation results over the dense accelerator show that the proposed accelerator achieves 1.79x speedup and reduces 23.51%, 69.53%, 88.67% on-chip memory size, energy, and DRAM accesses of VGG-16.","PeriodicalId":273095,"journal":{"name":"2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AICAS.2019.8771600","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Deep convolutional neural networks (CNNs) are widely used in image recognition and feature classification. However, deep CNNs are difficult to deploy fully on edge devices because of their computation-intensive and memory-intensive workloads. The energy efficiency of CNNs is dominated by off-chip memory accesses and convolution computation. In this paper, an energy-efficient accelerator is proposed for sparse compressed CNNs that reduces DRAM accesses and eliminates zero-operand computation. Weight compression is applied to sparse compressed CNNs to reduce the required memory capacity and bandwidth, since pruning removes a large portion of the connections and the ReLU function produces zero-valued activations. Additionally, workloads are distributed across channels to increase the degree of task parallelism, and all-row-to-all-row non-zero element multiplication is adopted to skip redundant computation. Simulation results show that, compared with a dense accelerator, the proposed accelerator achieves a 1.79x speedup and reduces on-chip memory size, energy, and DRAM accesses by 23.51%, 69.53%, and 88.67%, respectively, on VGG-16.
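The two ideas the abstract combines, relative-indexing compression of sparse weights and skipping zero-operand multiplications, can be illustrated with a minimal software sketch. This is an assumption-laden illustration, not the paper's actual encoding or hardware dataflow: the function names, the 4-bit offset width, and the padding-zero convention for long zero runs are all hypothetical choices made here for clarity.

```python
# Illustrative sketch (NOT the paper's exact scheme): each non-zero weight
# is stored with the number of zeros since the previous non-zero (its
# "relative index"), so full absolute addresses never need to be stored.

def compress_relative(weights, index_bits=4):
    """Encode a 1-D weight list as (relative_index, value) pairs.
    A zero run longer than 2**index_bits - 1 is broken up by emitting a
    padding zero, so every offset fits in index_bits bits."""
    max_run = (1 << index_bits) - 1
    pairs = []
    run = 0  # zeros seen since the last emitted entry
    for w in weights:
        if w == 0:
            run += 1
            if run == max_run + 1:      # offset would overflow: pad
                pairs.append((max_run, 0))
                run = 0
        else:
            pairs.append((run, w))
            run = 0
    return pairs

def decompress(pairs, length):
    """Rebuild the dense weight list from (relative_index, value) pairs."""
    out = [0] * length
    pos = -1
    for rel, v in pairs:
        pos += rel + 1                  # advance by the zero run, then land
        if pos < length:
            out[pos] = v
    return out

def sparse_mac(pairs, activations):
    """Multiply-accumulate that touches only stored (non-zero) weights and
    also skips zero activations (e.g. those produced by ReLU), so no
    zero-operand multiplication is ever issued."""
    acc = 0
    pos = -1
    for rel, v in pairs:
        pos += rel + 1
        a = activations[pos]
        if v != 0 and a != 0:           # skip padding zeros and ReLU zeros
            acc += v * a
    return acc
```

For example, `compress_relative([0, 3, 0, 0, 2, 0, 1])` yields three (offset, value) pairs instead of seven dense entries, and `sparse_mac` over that stream performs only three multiplications. A hardware version would keep the offset stream in a small on-chip relative-indexing memory, which is where the DRAM-access and energy savings reported above come from.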