SimBNN: A Similarity-Aware Binarized Neural Network Acceleration Framework

Cheng Fu, Shilin Zhu, Huili Chen, F. Koushanfar, Hao Su, Jishen Zhao
{"title":"SimBNN:一个相似度感知的二值化神经网络加速框架","authors":"Cheng Fu, Shilin Zhu, Huili Chen, F. Koushanfar, Hao Su, Jishen Zhao","doi":"10.1109/FCCM.2019.00060","DOIUrl":null,"url":null,"abstract":"Binarized Neural Networks (BNNs) eliminate bitwidth redundancy in Convolutional Neural Networks (CNNs) by using a single bit (-1/+1) for network parameters and intermediate representations. This greatly reduces off-chip data transfer and storage overhead. However, considerable computation redundancy remains in BNN inference. To tackle this problem, we investigate the similarity property in input data and kernel weights. We identify an average of 79% input similarity and 61% kernel similarity measured by our proposed metric across common network architectures. Motivated by this observation, we propose SimBNN, a fast and energy-efficient acceleration framework for BNN inference that leverages similarity properties. SimBNN consists of a set of similarity-aware accelerators, a weight reuse optimization algorithm, and a similarity selection mechanism. SimBNN incorporates two types of BNN accelerators, which exploit the input similarity and kernel similarity, respectively. More specifically, the result from the previous stage is reused if similarity is identified, thus significantly reducing BNN computation overhead. Furthermore, we propose a weight reuse optimization algorithm, which increases the weight similarity by off-line re-ordering weight kernels. Finally, our framework provides a systematic method to determine the optimal strategy between input data and kernel weights reuse, based on the similarity characteristics of input data and pre-trained BNNs.","PeriodicalId":116955,"journal":{"name":"2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"SimBNN: A Similarity-Aware Binarized Neural Network Acceleration Framework\",\"authors\":\"Cheng Fu, Shilin Zhu, Huili Chen, F. Koushanfar, Hao Su, Jishen Zhao\",\"doi\":\"10.1109/FCCM.2019.00060\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Binarized Neural Networks (BNNs) eliminate bitwidth redundancy in Convolutional Neural Networks (CNNs) by using a single bit (-1/+1) for network parameters and intermediate representations. This greatly reduces off-chip data transfer and storage overhead. However, considerable computation redundancy remains in BNN inference. To tackle this problem, we investigate the similarity property in input data and kernel weights. We identify an average of 79% input similarity and 61% kernel similarity measured by our proposed metric across common network architectures. Motivated by this observation, we propose SimBNN, a fast and energy-efficient acceleration framework for BNN inference that leverages similarity properties. SimBNN consists of a set of similarity-aware accelerators, a weight reuse optimization algorithm, and a similarity selection mechanism. SimBNN incorporates two types of BNN accelerators, which exploit the input similarity and kernel similarity, respectively. More specifically, the result from the previous stage is reused if similarity is identified, thus significantly reducing BNN computation overhead. Furthermore, we propose a weight reuse optimization algorithm, which increases the weight similarity by off-line re-ordering weight kernels. 
Finally, our framework provides a systematic method to determine the optimal strategy between input data and kernel weights reuse, based on the similarity characteristics of input data and pre-trained BNNs.\",\"PeriodicalId\":116955,\"journal\":{\"name\":\"2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)\",\"volume\":\"19 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FCCM.2019.00060\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FCCM.2019.00060","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Cited by: 3

Abstract

Binarized Neural Networks (BNNs) eliminate bitwidth redundancy in Convolutional Neural Networks (CNNs) by using a single bit (-1/+1) for network parameters and intermediate representations. This greatly reduces off-chip data transfer and storage overhead. However, considerable computation redundancy remains in BNN inference. To tackle this problem, we investigate the similarity property in input data and kernel weights. We identify an average of 79% input similarity and 61% kernel similarity measured by our proposed metric across common network architectures. Motivated by this observation, we propose SimBNN, a fast and energy-efficient acceleration framework for BNN inference that leverages similarity properties. SimBNN consists of a set of similarity-aware accelerators, a weight reuse optimization algorithm, and a similarity selection mechanism. SimBNN incorporates two types of BNN accelerators, which exploit the input similarity and kernel similarity, respectively. More specifically, the result from the previous stage is reused if similarity is identified, thus significantly reducing BNN computation overhead. Furthermore, we propose a weight reuse optimization algorithm, which increases the weight similarity by off-line re-ordering weight kernels. Finally, our framework provides a systematic method to determine the optimal strategy between input data and kernel weights reuse, based on the similarity characteristics of input data and pre-trained BNNs.
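
The single-bit representation is what removes most of the arithmetic cost: a dot product between two {-1, +1} vectors collapses into an XNOR followed by a popcount. The sketch below is our own illustration of that standard mapping, not code from the paper.

```python
# A minimal sketch (our illustration, not the paper's implementation) of the
# XNOR-popcount trick behind BNN inference: map -1 -> 0 and +1 -> 1, pack the
# bits into an integer, and a {-1,+1} dot product becomes a popcount of an XNOR.

def pack_bits(values):
    """Pack a list of -1/+1 values into an integer (bit i set iff value is +1)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word

def binary_dot(a_bits, w_bits, n):
    """Dot product of two {-1,+1} vectors of length n, given as packed bits.

    Matching bits (XNOR = 1) contribute +1 and mismatches -1, so the
    result is 2 * popcount(XNOR(a, w)) - n.
    """
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)  # XNOR, masked to n bits
    return 2 * bin(xnor).count("1") - n         # popcount, then rescale

# [-1, +1, +1, -1] . [+1, +1, -1, -1] = -1 + 1 - 1 + 1 = 0
assert binary_dot(pack_bits([-1, 1, 1, -1]), pack_bits([1, 1, -1, -1]), 4) == 0
```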
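The abstract does not define the proposed similarity metric, so the next sketch uses plain bit-level Hamming agreement as a stand-in, and shows the reuse idea it motivates: when consecutive inputs differ in only a few bit positions, the previous XNOR-popcount result can be patched per flipped bit instead of being recomputed. The function names here are our own.

```python
# Sketch of input-similarity reuse (our illustration; the abstract does not
# spell out SimBNN's exact metric, so bit-level Hamming agreement stands in).
# When consecutive inputs differ in few bits, the previous result is patched
# at the flipped positions instead of being recomputed from scratch.

def similarity(x_bits, y_bits, n):
    """Fraction of bit positions on which two packed n-bit words agree."""
    same = ~(x_bits ^ y_bits) & ((1 << n) - 1)
    return bin(same).count("1") / n

def incremental_dot(prev_result, prev_a, new_a, w_bits, n):
    """Update a {-1,+1} dot product after a few input bits flip.

    Each flipped bit toggles its match/mismatch status against w_bits,
    moving the result by +2 or -2 per position.
    """
    mask = (1 << n) - 1
    diff = (prev_a ^ new_a) & mask                      # flipped positions
    was_match = ~(prev_a ^ w_bits) & mask               # matched before flip
    lost = bin(diff & was_match).count("1")             # matches destroyed
    gained = bin(diff & ~was_match & mask).count("1")   # matches created
    return prev_result + 2 * (gained - lost)

# w = [+1,+1,-1,-1] packed as 0b0011; the input flips a single bit.
w, a_old, a_new = 0b0011, 0b0110, 0b0111
prev = 2 * bin(~(a_old ^ w) & 0b1111).count("1") - 4    # full result: 0
assert similarity(a_old, a_new, 4) == 0.75              # 3 of 4 bits agree
assert incremental_dot(prev, a_old, a_new, w, 4) == 2   # patched, not recomputed
```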
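For the weight side, the abstract states that kernels are re-ordered offline to raise kernel-to-kernel similarity but does not give the algorithm; a greedy nearest-neighbor chain over Hamming distance is one plausible way to realize it, sketched below.

```python
# A hedged sketch of the offline weight re-ordering idea. The abstract says
# kernels are re-ordered offline to raise kernel-to-kernel similarity but does
# not give the algorithm; a greedy nearest-neighbor chain over Hamming
# distance is one plausible illustration, not SimBNN's actual optimizer.

def hamming(x, y):
    """Number of differing bits between two packed kernels."""
    return bin(x ^ y).count("1")

def reorder_kernels(kernels):
    """Greedily chain kernels so that consecutive ones differ in few bits.

    Start from kernel 0, then repeatedly append the unvisited kernel closest
    (in Hamming distance) to the last one chosen. The accelerator would then
    stream kernels in this order, reusing most of each previous result.
    """
    remaining = list(range(len(kernels)))
    order = [remaining.pop(0)]
    while remaining:
        last = kernels[order[-1]]
        nxt = min(remaining, key=lambda i: hamming(kernels[i], last))
        remaining.remove(nxt)
        order.append(nxt)
    return order

# Four 8-bit kernels: 0/2 and 1/3 are near-duplicates of each other.
ks = [0b10110010, 0b01001101, 0b10110011, 0b01001100]
assert reorder_kernels(ks) == [0, 2, 1, 3]  # 1 + 7 + 1 bit flips vs. 8 + 7 + 8
```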
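Finally, the framework selects between the two accelerator types based on the similarity characteristics of the input data and the pre-trained BNN. The abstract only names this selection mechanism; the decision rule below is a rough stand-in of ours.

```python
# The selection mechanism itself is only named in the abstract; as a rough
# stand-in, pick whichever accelerator has more measured similarity to exploit.

def choose_accelerator(avg_input_sim, avg_kernel_sim):
    """Dispatch to the reuse strategy with the larger expected savings."""
    return "input-reuse" if avg_input_sim >= avg_kernel_sim else "kernel-reuse"

# With the averages the abstract reports (79% input, 61% kernel similarity),
# the input-similarity accelerator would be chosen.
assert choose_accelerator(0.79, 0.61) == "input-reuse"
```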