Systolic Array based Multiply Accumulation Unit for IoT Edge Accelerators

P. Lahari, S. Yellampalli, R. Vaddi
Published in: 2021 IEEE International Symposium on Smart Electronic Systems (iSES) (Formerly iNiS)
Publication date: 2021-12-01
DOI: 10.1109/iSES52644.2021.00058 (https://doi.org/10.1109/iSES52644.2021.00058)
Citations: 1

Abstract

An accelerator is hardware that runs alongside the processor and executes key functions much faster than the processor can; its main purpose is to increase speed. Deep neural networks (DNNs) have achieved results across a wide range of machine learning applications, such as image, video, and text classification and language translation. The purpose of a DNN accelerator is to speed up the most expensive computation, namely matrix multiplication. A systolic-array-based accelerator resembles a multiply-accumulate (MAC) unit in which systolic-array-based multiplication is followed by an adder and an accumulator. A MAC unit comprises a multiplier, an adder, and an accumulator. Here the multiplier is designed using a systolic array, and its output is fed as one of the inputs to the adder, which is followed by the accumulator. In this paper, a general matrix-multiplier-based MAC unit is compared with a systolic-array-based MAC unit using Xilinx ISE 14.5, in terms of area, delay, and speed. The systolic-array-based MAC unit uses 49% less area and has 35% less delay, and in turn provides higher speed, than the general matrix-multiplier-based MAC unit.
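
To make the multiply-accumulate dataflow concrete, the following is a minimal software sketch of a systolic array computing a matrix product. It is not the authors' hardware design: the abstract does not state which dataflow is used, so an output-stationary arrangement is assumed here (each processing element keeps its own accumulator, operands from A move right and operands from B move down), and the function name `systolic_matmul` is hypothetical.

```python
def systolic_matmul(A, B):
    """Cycle-by-cycle model of an (assumed) output-stationary systolic array.

    A is n x k, B is k x m. PE (i, j) accumulates the entry C[i][j].
    Returns the result matrix and the number of cycles simulated.
    """
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]    # accumulator inside each PE
    a_reg = [[0] * m for _ in range(n)]  # A operands, moving left to right
    b_reg = [[0] * m for _ in range(n)]  # B operands, moving top to bottom
    cycles = n + m + k - 2               # time for the last operand pair to reach the far corner

    for t in range(cycles):
        # Shift operands one hop; iterate from the far edge so each value moves once.
        for i in range(n):
            for j in range(m - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
        for j in range(m):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]

        # Inject skewed inputs at the edges: row i lags by i cycles, column j by j cycles.
        for i in range(n):
            s = t - i
            a_reg[i][0] = A[i][s] if 0 <= s < k else 0
        for j in range(m):
            s = t - j
            b_reg[0][j] = B[s][j] if 0 <= s < k else 0

        # Every PE performs one multiply-accumulate per cycle.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]

    return acc, cycles


if __name__ == "__main__":
    C, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(C, cycles)
```

In this model every processing element does one multiply and one add per cycle, and the skewed input timing is what lets matching operands meet in the right element without any global routing. Area and delay figures such as those reported in the abstract come from hardware synthesis and cannot be reproduced by a software model like this one.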