Energy Efficient Hardware Implementation of 2-D Convolution for Convolutional Neural Network

S. K. Sharma, Anu Gupta, K. Raju
Published in: 2022 IEEE 9th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON)
Publication date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986483 (https://doi.org/10.1109/UPCON56432.2022.9986483)

Abstract

In recent years, deep neural networks (DNNs) have been widely adopted for computer vision applications because of their high classification accuracy and versatility. The convolutional neural network (CNN) is one of the most popular DNN architectures and is widely used for image, speech, and video recognition. The extensive computation and large memory requirements of CNNs are the bottleneck for their application. Field-programmable gate arrays (FPGAs) are considered suitable hardware platforms for deploying CNNs under low power requirements. This paper focuses on the design and implementation of a hardware accelerator that performs the convolution product (matrix-matrix multiplication). We use two optimization techniques to achieve energy efficiency. First, the dataflow of the convolution phase is rescheduled to reduce undesired on-chip memory accesses. Second, efficiency is further enhanced by reducing the internal parallelism of the structure as much as possible. Our architecture is implemented on the Xilinx ZCU104 evaluation board. The implemented design attains 98.1 GOPS/Joule and 32.77 GOPS/Joule for 8-bit and 16-bit data widths, respectively.
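The abstract describes the convolution product as a matrix-matrix multiplication. The following is a minimal software sketch of that idea, not the authors' hardware design: it computes a valid-mode 2-D convolution (CNN-style, without kernel flip) once with nested multiply-accumulate loops and once by unrolling image windows into a patch matrix (the commonly used im2col formulation, which is an assumption here and is not named in the abstract). With a single kernel the product is matrix-vector; stacking several flattened kernels as columns would make it matrix-matrix. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def conv2d_direct(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid-mode 2-D convolution (no kernel flip) using nested MAC loops."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0] * (w - kw + 1) for _ in range(h - kh + 1)]
    for r in range(h - kh + 1):
        for c in range(w - kw + 1):
            acc = 0  # accumulator for one output pixel
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


def im2col(image: Matrix, kh: int, kw: int) -> Matrix:
    """Unroll every kh x kw window into one row of a patch matrix."""
    h, w = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(kh) for j in range(kw)]
        for r in range(h - kh + 1)
        for c in range(w - kw + 1)
    ]


def conv2d_matmul(image: Matrix, kernel: Matrix) -> Matrix:
    """Same convolution written as (patch matrix) x (flattened kernel)."""
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    flat_kernel = [v for row in kernel for v in row]
    flat_out = [
        sum(p * k for p, k in zip(patch, flat_kernel))
        for patch in im2col(image, kh, kw)
    ]
    # Reshape the flat result back into output rows.
    return [flat_out[i:i + out_w] for i in range(0, len(flat_out), out_w)]


def mac_count(h: int, w: int, kh: int, kw: int) -> int:
    """Number of multiply-accumulate operations for one valid-mode pass."""
    return (h - kh + 1) * (w - kw + 1) * kh * kw


if __name__ == "__main__":
    img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    ker = [[1, 0], [0, -1]]
    print(conv2d_direct(img, ker))
    print(conv2d_matmul(img, ker))
    print(mac_count(4, 4, 2, 2))
```

Both functions return the same result. The direct form reads each input pixel once per overlapping window, which is the kind of repeated memory access that a rescheduled dataflow aims to reduce. The MAC count gives a rough operation total that throughput figures such as GOPS are typically based on.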