New paradigm of FPGA-based computational intelligence from surveying the implementation of DNN accelerators

IF 0.9; Region 4 (Computer Science); Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Design Automation for Embedded Systems; Pub Date: 2022-01-12; DOI: 10.1007/s10617-021-09256-8

Yang You, Yinghui Chang, Weikang Wu, Bingrui Guo, Hongyin Luo, Xiaojie Liu, Bijing Liu, Kairong Zhao, Shan He, Lin Li, Donghui Guo


Citations: 0

Abstract

With the rapid development of artificial intelligence, the Internet of Things, 5G, and other technologies, a number of emerging intelligent applications have appeared, represented by image recognition, voice recognition, autonomous driving, and intelligent manufacturing. These applications require efficient and intelligent processing systems for massive data calculations, so there is an urgent need to run better DNNs faster. FPGAs offer a higher energy efficiency ratio than GPUs, and a shorter development cycle and better flexibility than ASICs. However, the FPGA is not a perfect hardware platform for computational intelligence either. This paper surveys the latest acceleration work on common DNNs and proposes three new directions for breaking the bottleneck of DNN implementation. To improve the computing speed and energy efficiency of edge devices, intelligent embedded approaches, including model compression and optimized data movement across the entire system, are most commonly used. With the gradual slowdown of Moore’s Law, the traditional von Neumann architecture runs into a “memory wall” problem, resulting in higher power consumption. In-memory computation will be the right remedy in the post-Moore’s-Law era. A more complete software/hardware co-design environment will direct researchers’ attention to exploring deep learning algorithms and running them faster at the hardware level. These new directions open a relatively new paradigm in computational intelligence, which has attracted substantial attention from the research community and demonstrated greater potential than traditional techniques.
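As a rough illustration of what "model compression" can mean in practice, the sketch below shows symmetric 8-bit weight quantization, one common compression technique for FPGA and edge deployment. It is not taken from the paper: the function names `quantize_int8` and `dequantize`, the per-tensor scaling scheme, and the sample weights are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of float weights to int8 codes.

    Hypothetical helper for illustration; not the method of the surveyed paper.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]


# Illustrative weights (made up for this example).
weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The rounding error per weight is bounded by about half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Storing `codes` instead of 32-bit floats cuts weight storage to a quarter, which also reduces the data movement the abstract identifies as a bottleneck; the cost is the bounded rounding error captured in `max_err`.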




Source journal

Design Automation for Embedded Systems (Engineering Technology - Computer Science: Software Engineering)

CiteScore

2.60

Self-citation rate

0.00%

Publication volume

Review time

>12 weeks

About the journal: Embedded (electronic) systems have become the electronic engines of modern consumer and industrial devices, from automobiles to satellites, from washing machines to high-definition TVs, and from cellular phones to complete base stations. These embedded systems encompass a variety of hardware and software components which implement a wide range of functions including digital, analog and RF parts. Although embedded systems have been designed for decades, the systematic design of such systems with well-defined methodologies, automation tools and technologies has gained attention primarily in the last decade. Advances in silicon technology and increasingly demanding applications have significantly expanded the scope and complexity of embedded systems. These systems are only now becoming possible due to advances in methodologies, tools, architectures and design techniques. Design Automation for Embedded Systems is a multidisciplinary journal which addresses the systematic design of embedded systems, focusing primarily on tools, methodologies and architectures for embedded systems, including HW/SW co-design, simulation and modeling approaches, synthesis techniques, architectures and design exploration, among others. Design Automation for Embedded Systems offers a forum for scientists and engineers to report on their latest work on algorithms, tools, architectures, case studies and real design examples related to embedded systems hardware and software. Design Automation for Embedded Systems is an innovative journal which distinguishes itself by welcoming high-quality papers on the methodology, tools, architectures and design of electronic embedded systems, leading to a true multidisciplinary system design journal.