Concurrent MAC unit design using VHDL for deep learning networks on FPGA

2018 IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE). Pub Date: 2018-04-28. DOI: 10.1109/ISCAIE.2018.8405440

Hossam O. Ahmed, M. Ghoneima, M. Dessouky


Citations: 13

Abstract

Deep neural network algorithms have proven their enormous capabilities in a wide range of artificial intelligence applications, especially in printed/handwritten text recognition, multimedia processing, robotics, and many other high-end technological trends. The most challenging aspect nowadays is overcoming the extreme computational demands of applying such algorithms, especially in real-time systems. Recently, the Field Programmable Gate Array (FPGA) has been considered one of the optimum hardware accelerator platforms for deep neural network architectures due to its high adaptability and the high degree of parallelism it offers. In this paper, the proposed 8-bit fixed-point parallel multiply-accumulate (MAC) unit architecture aims to create a fully customized MAC unit for Convolutional Neural Networks (CNNs) instead of depending on the conventional DSP blocks and embedded memory units in the FPGA silicon fabric. The proposed MAC unit architecture is designed in VHDL and can achieve a computational speed of up to 4.17 Giga Operations per Second (GOPS) on high-density FPGAs.
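The abstract does not give the internal details of the MAC unit, so the following is only a minimal behavioral sketch in Python of what an 8-bit fixed-point parallel multiply-accumulate step computes. It is not the authors' VHDL architecture. The signed two's-complement operand format, the 32-bit accumulator width, the lane count of 4, and all function names (`to_int8`, `mac_lane`, `parallel_mac`, `dot_product`) are assumptions made for illustration.

```python
from typing import List, Sequence

ACC_BITS = 32  # assumed accumulator width; the abstract does not specify one


def to_int8(x: int) -> int:
    """Wrap an integer into the signed 8-bit two's-complement range."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def mac_lane(acc: int, a: int, b: int) -> int:
    """One MAC step for a single lane: acc + a * b, wrapped to ACC_BITS."""
    prod = to_int8(a) * to_int8(b)          # 8-bit x 8-bit gives up to 16 bits
    total = (acc + prod) & ((1 << ACC_BITS) - 1)
    # Reinterpret the wrapped value as a signed number.
    return total - (1 << ACC_BITS) if total >> (ACC_BITS - 1) else total


def parallel_mac(accs: Sequence[int], a_vec: Sequence[int],
                 b_vec: Sequence[int]) -> List[int]:
    """Apply one MAC step to every lane; in hardware these run concurrently."""
    return [mac_lane(acc, a, b) for acc, a, b in zip(accs, a_vec, b_vec)]


def dot_product(weights: Sequence[int], pixels: Sequence[int],
                lanes: int = 4) -> int:
    """Dot product computed `lanes` elements per step (hypothetical lane count)."""
    accs = [0] * lanes
    for i in range(0, len(weights), lanes):
        w = list(weights[i:i + lanes])
        p = list(pixels[i:i + lanes])
        # Pad the last partial group with zeros so every lane has an operand.
        w += [0] * (lanes - len(w))
        p += [0] * (lanes - len(p))
        accs = parallel_mac(accs, w, p)
    # Final reduction of the per-lane accumulators (not wrapped here).
    return sum(accs)
```

As a usage example, `dot_product([1, 2, 3, -4, 5], [10, -20, 30, 40, 50])` processes the first four products in one parallel step and the fifth in a second step. A convolution window in a CNN layer reduces to exactly this kind of sum of products, which is why the MAC unit is the operation worth customizing.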



