An Efficient Hardware Architecture for Activation Function in Deep Learning Processor

Lin Li, Shengbing Zhang, Juan Wu
Published in: 2018 IEEE 3rd International Conference on Image, Vision and Computing (ICIVC)
Publication date: 2018-06-01
DOI: 10.1109/ICIVC.2018.8492754 (https://doi.org/10.1109/ICIVC.2018.8492754)
Citations: 9

Abstract

This paper presents an efficient five-stage pipelined hardware architecture for activation functions in deep learning processors, based on piecewise linear interpolation, together with a novel algorithm for mapping neuron data to look-up table (LUT) addresses. Compared with previous designs based on serial calculation, the proposed architecture can achieve a speedup of at least 3 times. Four commonly used activation functions are designed on the proposed architecture and implemented on a Xilinx XC6VLX240T. LeNet-5 and AlexNet are used as benchmarks to test the inference accuracy of the different activation functions, with different numbers of piecewise segments, on the MNIST and CIFAR-10 test sets in a deep learning processor prototype system. The experimental results show that the proposed architecture effectively performs the activation function calculations required by the deep learning processor, with negligible accuracy loss. The architecture is adaptable to numerous activation functions and can be widely used in the design of other deep learning processors.
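As a rough software illustration of the piecewise linear interpolation idea mentioned in the abstract, the sketch below tabulates a slope and intercept for each segment and evaluates an activation function by table look-up. It is not the authors' design: the uniform segmentation, the input range of [-8, 8], the simple scaled-offset address mapping, and all function names (`build_lut`, `pwl_eval`, `max_abs_error`) are assumptions made for this example, and the abstract does not describe the paper's own mapping algorithm or pipeline stages.

```python
import math


def build_lut(func, lo, hi, segments):
    """Precompute a (slope, intercept) pair for each uniform segment of [lo, hi]."""
    step = (hi - lo) / segments
    table = []
    for i in range(segments):
        x0 = lo + i * step
        y0, y1 = func(x0), func(x0 + step)
        slope = (y1 - y0) / step
        table.append((slope, y0 - slope * x0))
    return table


def pwl_eval(table, lo, hi, x):
    """Evaluate the piecewise linear approximation at x."""
    segments = len(table)
    # Clamp the input to the tabulated range (assumed saturation behaviour).
    x = min(max(x, lo), hi)
    # Assumed address mapping: with uniform segments the index is a scaled offset.
    index = min(int((x - lo) * segments / (hi - lo)), segments - 1)
    slope, intercept = table[index]
    # One multiply and one add per input value.
    return slope * x + intercept


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def max_abs_error(segments, lo=-8.0, hi=8.0, samples=2001):
    """Largest absolute error of the approximation over evenly spaced samples."""
    table = build_lut(sigmoid, lo, hi, segments)
    worst = 0.0
    for k in range(samples):
        x = lo + (hi - lo) * k / (samples - 1)
        worst = max(worst, abs(pwl_eval(table, lo, hi, x) - sigmoid(x)))
    return worst


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, max_abs_error(n))
```

Running the script prints the maximum approximation error for several segment counts, which gives a feel for the trade-off between table size and accuracy that the abstract's "different numbers of piecewise segments" experiments refer to. Other activation functions can be tabulated by passing a different `func` to `build_lut`.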