Comparative Analysis of Hardware Implementations of a Convolutional Neural Network

Gabriel H. Eisenkraemer, L. Oliveira, E. Carara
{"title":"卷积神经网络硬件实现的比较分析","authors":"Gabriel H. Eisenkraemer, L. Oliveira, E. Carara","doi":"10.1109/SBCCI55532.2022.9893234","DOIUrl":null,"url":null,"abstract":"Artificial Neural Networks (ANNs) have become the most popular machine learning technique for data processing, performing central functions in a wide variety of applications. In many cases, these models are used within constrained scenarios, in which a local execution of the algorithm is necessary to avoid latency and safety issues of remote computing (e.g, autonomous vehicles, edge devices in IoT networks). Even so, the known computational complexity of these models is still a challenge in such contexts, as implementation costs and performance requirements are difficult to balance. In these scenarios, pa-rameter quantization techniques are essential to simplifying the operations and memory footprint to make the hardware implementation more viable. In this paper, a case study is devised in which a convolutional neural network (CNN) architecture is fully implemented in hardware with three different optimization strategies, having parameters mapped to low bit-width fixed point integers with a power-of-two quantization scheme. Both ASIC and FPGA implementation flows are followed, allowing for an in-depth analysis of each circuit version. The obtained results show that the adopted quantization process enables optimizations on the implemented circuit, reducing about 50% of the circuitry area and 87.5% of the memory requirement. At the same time, the application performance was kept at the same level.","PeriodicalId":231587,"journal":{"name":"2022 35th SBC/SBMicro/IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI)","volume":"39 4","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Comparative Analysis of Hardware Implementations of a Convolutional Neural Network\",\"authors\":\"Gabriel H. Eisenkraemer, L. Oliveira, E. Carara\",\"doi\":\"10.1109/SBCCI55532.2022.9893234\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Artificial Neural Networks (ANNs) have become the most popular machine learning technique for data processing, performing central functions in a wide variety of applications. In many cases, these models are used within constrained scenarios, in which a local execution of the algorithm is necessary to avoid latency and safety issues of remote computing (e.g, autonomous vehicles, edge devices in IoT networks). Even so, the known computational complexity of these models is still a challenge in such contexts, as implementation costs and performance requirements are difficult to balance. In these scenarios, pa-rameter quantization techniques are essential to simplifying the operations and memory footprint to make the hardware implementation more viable. In this paper, a case study is devised in which a convolutional neural network (CNN) architecture is fully implemented in hardware with three different optimization strategies, having parameters mapped to low bit-width fixed point integers with a power-of-two quantization scheme. Both ASIC and FPGA implementation flows are followed, allowing for an in-depth analysis of each circuit version. The obtained results show that the adopted quantization process enables optimizations on the implemented circuit, reducing about 50% of the circuitry area and 87.5% of the memory requirement. 
At the same time, the application performance was kept at the same level.\",\"PeriodicalId\":231587,\"journal\":{\"name\":\"2022 35th SBC/SBMicro/IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI)\",\"volume\":\"39 4\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-08-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 35th SBC/SBMicro/IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SBCCI55532.2022.9893234\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 35th SBC/SBMicro/IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SBCCI55532.2022.9893234","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Artificial Neural Networks (ANNs) have become the most popular machine learning technique for data processing, performing central functions in a wide variety of applications. In many cases, these models are used within constrained scenarios, in which a local execution of the algorithm is necessary to avoid the latency and safety issues of remote computing (e.g., autonomous vehicles, edge devices in IoT networks). Even so, the known computational complexity of these models is still a challenge in such contexts, as implementation costs and performance requirements are difficult to balance. In these scenarios, parameter quantization techniques are essential for simplifying operations and shrinking the memory footprint, making a hardware implementation more viable. In this paper, a case study is devised in which a convolutional neural network (CNN) architecture is fully implemented in hardware with three different optimization strategies, with parameters mapped to low bit-width fixed-point integers under a power-of-two quantization scheme. Both ASIC and FPGA implementation flows are followed, allowing for an in-depth analysis of each circuit version. The obtained results show that the adopted quantization process enables optimizations of the implemented circuit, reducing circuit area by about 50% and memory requirements by 87.5%, while application performance was kept at the same level.
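The abstract does not include code; as an illustration of the power-of-two scheme it describes, the sketch below rounds each weight to the nearest signed power of two, so that a multiply in the convolution datapath can be realized as a bit shift. The reported 87.5% memory reduction would be consistent with, for example, 32-bit parameters shrinking to 4-bit codes, though the exact bit widths are not stated in the abstract. The function name and the `exp_bits` exponent range are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def po2_quantize(w, exp_bits=3):
    """Round each weight to the nearest signed power of two.

    Illustrative only: `exp_bits` bounds the exponent range so the
    quantized value fits a small fixed-point code; the paper's exact
    scheme and bit widths are not specified in the abstract.
    """
    sign = np.sign(w)
    mag = np.abs(w)
    nonzero = mag > 0
    exp = np.zeros_like(mag)
    # Nearest power-of-two exponent for each nonzero magnitude.
    exp[nonzero] = np.round(np.log2(mag[nonzero]))
    # Clamp to the exponents an exp_bits-wide code is assumed to cover,
    # here [-2**(exp_bits - 1), 0].
    exp = np.clip(exp, -(2 ** (exp_bits - 1)), 0)
    return sign * 2.0 ** exp  # zero weights stay zero (sign == 0)

weights = np.array([0.31, -0.07, 0.52, -0.9, 0.0])
print(po2_quantize(weights))  # [ 0.25 -0.0625  0.5 -1.  0. ]

# In hardware, multiplying an activation by a power-of-two weight is a
# bit shift: x * 2**e == x << e for e >= 0, or x >> -e for e < 0.
```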