SkippyNN
Reza Hojabr, Kamyar Givaki, S. M. R. Tayaranian, P. Esfahanian, A. Khonsari, D. Rahmati, M. Najafi
Employing convolutional neural networks (CNNs) in embedded devices calls for novel low-cost and energy-efficient CNN accelerators. Stochastic computing (SC) is a promising low-cost alternative to conventional binary implementations of CNNs. Despite the low-cost advantage, SC-based arithmetic units suffer from prohibitive execution times due to processing long bit-streams. In particular, multiplication, the dominant operation in convolution, is extremely time-consuming, which hampers the use of SC methods in embedded CNNs. In this work, we propose a novel architecture, called SkippyNN, that reduces the computation time of SC-based multiplications in the convolutional layers of CNNs. Each convolution in a CNN comprises numerous multiplications in which each input value is multiplied by a weight vector. Once the first product has been computed, the following products can be obtained by multiplying the input by the differences between successive weights. Leveraging this property, we develop a differential Multiply-and-Accumulate unit, called DMAC, to reduce the time consumed by convolutions in SkippyNN. We evaluate the efficiency of SkippyNN on four modern CNNs. On average, SkippyNN offers a 1.2x speedup and 2.7x energy saving compared to a binary implementation of CNN accelerators.
DOI: 10.1145/3316781.3317911
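To make the differential trick concrete, here is a minimal sketch of the DMAC idea in Python, not the paper's hardware design: `sc_multiply` is a hypothetical stand-in for a stochastic-computing multiplier whose latency in real SC hardware grows with the bit-stream length needed for its weight operand, so multiplying by small successive-weight differences finishes sooner than multiplying by the full weights.

```python
def sc_multiply(x, w):
    """Stand-in for an SC multiply; in hardware, its latency scales
    with the bit-stream length needed to represent w, so small |w|
    (e.g., a difference of neighboring weights) is cheap."""
    return x * w

def dmac(x, weights):
    """Accumulate sum_i x * weights[i] differentially: perform one
    full multiply, then update the running product with multiplies
    by the (typically small) differences of successive weights."""
    prod = sc_multiply(x, weights[0])  # the only full-magnitude multiply
    acc = prod
    for prev, cur in zip(weights, weights[1:]):
        # x*cur == x*prev + x*(cur - prev)
        prod += sc_multiply(x, cur - prev)
        acc += prod
    return acc
```

For example, `dmac(2.0, [0.5, 0.6, 0.4])` returns 3.0, matching `2.0 * (0.5 + 0.6 + 0.4)`, while only the first multiplication uses a full-magnitude weight.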
{"title":"SkippyNN","authors":"Reza Hojabr, Kamyar Givaki, S. M. R. Tayaranian, P. Esfahanian, A. Khonsari, D. Rahmati, M. Najafi","doi":"10.1145/3316781.3317911","DOIUrl":"https://doi.org/10.1145/3316781.3317911","url":null,"abstract":"Employing convolutional neural networks (CNNs) in embedded devices seeks novel low-cost and energy efficient CNN accelerators. Stochastic computing (SC) is a promising low-cost alternative to conventional binary implementations of CNNs. Despite the low-cost advantage, SC-based arithmetic units suffer from prohibitive execution time due to processing long bit-streams. In particular, multiplication as the main operation in convolution computation, is an extremely time-consuming operation which hampers employing SC methods in designing embedded CNNs.In this work, we propose a novel architecture, called SkippyNN, that reduces the computation time of SC-based multiplications in the convolutional layers of CNNs. Each convolution in a CNN is composed of numerous multiplications where each input value is multiplied by a weight vector. Producing the result of the first multiplication, the following multiplications can be performed by multiplying the input and the differences of the successive weights. Leveraging this property, we develop a differential Multiply-and-Accumulate unit, called DMAC, to reduce the time consumed by convolutions in SkippyNN. We evaluate the efficiency of SkippyNN using four modern CNNs. On average, SkippyNN offers 1.2x speedup and 2.7x energy saving compared to the binary implementation of CNN accelerators.","PeriodicalId":391209,"journal":{"name":"Proceedings of the 56th Annual Design Automation Conference 2019","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125865355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BRIC
M. Imani, Justin Morris, John G. Messerly, Helen Shu, Yaobang Deng, Tajana Rosing
Brain-inspired Hyperdimensional (HD) computing is a new computing paradigm that emulates neural activity in high-dimensional space. The first step in HD computing is to map each data point into a high-dimensional space (e.g., 10,000 dimensions), which requires thousands of operations for each element of the data in the original domain; encoding alone takes about 80% of training execution time. In this paper, we propose BRIC, a fully binary Brain-Inspired Classifier based on HD computing for energy-efficient and high-accuracy classification. BRIC introduces a novel encoding module based on random projection with a predictable memory access pattern that can be implemented efficiently in hardware. BRIC is the first HD-based approach that provides data projection with a 1:1 ratio to the original data and enables all training/inference computation to be performed using binary hypervectors. To further improve BRIC's efficiency, we develop an online dimension-reduction approach that removes insignificant hypervector dimensions during training. Additionally, we design a fully pipelined FPGA implementation that accelerates BRIC in both the training and inference phases. Our evaluation of BRIC on a wide range of classification applications shows that BRIC achieves $64.1\times$ and $9.8\times$ ($43.8\times$ and $6.1\times$) energy efficiency and speedup, respectively, compared to baseline HD computing during training (inference), while providing the same classification accuracy.
CCS Concepts: • Computing methodologies → Machine learning approaches; Supervised learning.
DOI: 10.1145/3316781.3317785
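To illustrate the encode/train/classify flow, the following is a minimal NumPy sketch of HD classification with binary random-projection encoding. The ±1 projection matrix, majority-vote bundling, and Hamming-distance inference are illustrative assumptions consistent with common HD practice, not BRIC's exact encoder or circuits.

```python
import numpy as np

D = 10_000  # hypervector dimensionality

def make_projection(n_features, seed=0):
    """Random +/-1 projection matrix (illustrative; BRIC's actual
    encoder is designed for predictable memory access in hardware)."""
    rng = np.random.default_rng(seed)
    return rng.choice([-1, 1], size=(D, n_features))

def encode(x, proj):
    """Map an input vector to a binary {0,1} hypervector via the
    sign of a random projection."""
    return (proj @ x >= 0).astype(np.uint8)

def train(xs, ys, proj, n_classes):
    """Bundle each class's encoded samples by per-dimension majority
    vote, yielding one binary hypervector per class."""
    sums = np.zeros((n_classes, D))
    for x, y in zip(xs, ys):
        sums[y] += encode(x, proj)
    counts = np.bincount(np.asarray(ys), minlength=n_classes)[:, None]
    return (2 * sums >= counts).astype(np.uint8)

def classify(x, class_hvs, proj):
    """Predict the class whose hypervector is nearest in Hamming
    distance to the encoded query."""
    hv = encode(x, proj)
    dists = (class_hvs != hv).sum(axis=1)
    return int(np.argmin(dists))
```

Because every hypervector is binary, inference reduces to XOR-and-popcount operations, which is what makes fully binary training/inference attractive for a pipelined FPGA implementation.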
{"title":"BRIC","authors":"M. Imani, Justin Morris, John G. Messerly, Helen Shu, Yaobang Deng, Tajana Rosing","doi":"10.1145/3316781.3317785","DOIUrl":"https://doi.org/10.1145/3316781.3317785","url":null,"abstract":"Brain-inspired Hyperdimensional (HD) computing is a new computing paradigm emulating the neuron’s activity in high-dimensional space. The first step in HD computing is to map each data point into high-dimensional space (e.g., 10,000), which requires the computation of thousands of operations for each element of data in the original domain. Encoding alone takes about 80% of the execution time of training. In this paper, we propose BRIC, a fully binary Brain-Inspired Classifier based on HD computing for energy-efficient and high-accuracy classification. BRIC introduces a novel encoding module based on random projection with a predictable memory access pattern which can efficiently be implemented in hardware. BRIC is the first HD-based approach which provides data projection with a 1:1 ratio to the original data and enables all training/inference computation to be performed using binary hypervectors. To further improve BRIC efficiency, we develop an online dimension reduction approach which removes insignificant hypervector dimensions during training. Additionally, we designed a fully pipelined FPGA implementation which accelerates BRIC in both training and inference phases. Our evaluation of BRIC a wide range of classification applications show that BRIC can achieve $64.1 times$ and $9.8 times (43.8 times$ and $6.1 times) $ energy efficiency and speed up as compared to baseline HD computing during training (inference) while providing the same classification accuracy.CCS CONCEPTS• Computing methodologies $rightarrow$ Machinelearningapproaches; Supervised learning;","PeriodicalId":391209,"journal":{"name":"Proceedings of the 56th Annual Design Automation Conference 2019","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116680937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 56th Annual Design Automation Conference 2019","authors":"","doi":"10.1145/3316781","DOIUrl":"https://doi.org/10.1145/3316781","url":null,"abstract":"","PeriodicalId":391209,"journal":{"name":"Proceedings of the 56th Annual Design Automation Conference 2019","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125153897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}