CNN hardware acceleration on a low-power and low-cost APSoC
Pub Date: 2019-10-01
DOI: 10.1109/DASIP48288.2019.9049213
P. Meloni, Antonio Garufi, Gianfranco Deriu, Marco Carreras, Daniela Loi
Deep learning, and Convolutional Neural Networks (CNNs) in particular, is currently one of the most promising and widely used classes of algorithms in the field of artificial intelligence, being employed in a wide range of tasks. However, their high computational complexity and storage demands limit their efficient deployment on resource-limited embedded systems and IoT devices. To address this problem, in recent years a wide landscape of customized FPGA-based hardware acceleration solutions has been presented in the literature, focused on combining high performance and power efficiency. Most of them are implemented on mid- to high-range devices integrating different computing cores, and target compute-intensive models such as AlexNet and VGG16. In this work, we implement a CNN inference accelerator on a compact and cost-optimized device, the MiniZed development board from Avnet, which integrates a single-core Zynq 7Z007S. We measure the execution time and energy consumption of the developed accelerator and compare it with a CPU-based software implementation. The results show that the accelerator achieves a frame rate of 13 fps on the end-to-end execution of the ALL-CNN-C model, and 4 fps on DarkNet. Compared with the software implementation, it is 5 times faster, providing up to 10.62 giga operations per second (GOPS) at 80 MHz while consuming 1.08 W of on-chip power.
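As a quick reading aid, the snippet below is an illustrative back-of-the-envelope calculation only, using the figures quoted in the abstract (1.08 W, 10.62 GOPS, 13 fps and 4 fps); the derived per-frame energy and operation counts are not taken from the paper, and the operation count assumes the peak throughput is sustained.

```python
# Illustrative check of the figures quoted above (not from the paper):
# per-frame energy and the operation count implied by the frame rates.

power_w = 1.08        # reported on-chip power
peak_gops = 10.62     # reported throughput at 80 MHz
frame_rates = {"ALL-CNN-C": 13, "DarkNet": 4}   # reported fps

for model, fps in frame_rates.items():
    energy_per_frame_j = power_w / fps           # W * (s/frame) = J/frame
    ops_per_frame_g = peak_gops / fps            # upper bound at peak throughput
    print(f"{model}: ~{energy_per_frame_j:.3f} J/frame, "
          f"<= {ops_per_frame_g:.2f} Gop/frame assuming peak GOPS")
```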
{"title":"CNN hardware acceleration on a low-power and low-cost APSoC","authors":"P. Meloni, Antonio Garufi, Gianfranco Deriu, Marco Carreras, Daniela Loi","doi":"10.1109/DASIP48288.2019.9049213","DOIUrl":"https://doi.org/10.1109/DASIP48288.2019.9049213","url":null,"abstract":"Deep learning and Convolutional Neural Networks (CNNs) in particular, are currently one of the most promising and widely used classes of algorithms in the field of artificial intelligence, being employed in a wide range of tasks. However, their high computational complexity and storage demands limit their efficient deployment on resource-limited embedded systems and IoT devices. To address this problem, in recent years a wide landscape of customized FPGA-based hardware acceleration solutions has been presented in literature, focused on combining high performance and power efficiency. Most of them are implemented on mid- to high-range devices including different computing cores, and target intensive models such as AlexNet and VGG16. In this work, we implement a CNN inference accelerator on a compact and cost-optimized device, the Minized development board from Avnet, integrating a single-core Zynq 7Z007S. We measure the execution time and energy consumption of the developed accelerator, and we compare it with a CPU-based software implementation. The results show that the accelerator achieves a frame rate of 13 fps on the end-to-end execution of ALL-CNN-C model, and 4 fps on DarkNet. Compared with the software implementation, it was 5 times faster providing up to 10.62 giga operations per second (GOPS) at 80 MHz while consuming 1.08 W of on-chip power.","PeriodicalId":120855,"journal":{"name":"2019 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134270916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SparseCCL: Connected Components Labeling and Analysis for sparse images
Pub Date: 2019-10-01
DOI: 10.1109/DASIP48288.2019.9049184
A. Hennequin, Benjamin Couturier, V. Gligorov, L. Lacassagne
Connected components labeling and analysis for dense images have been extensively studied on a wide range of architectures. Some applications, such as particle detectors in High Energy Physics, need to analyse many small and sparse images at high throughput. Because they process all pixels of the image, classic algorithms for dense images are inefficient on sparse data. We address this inefficiency by introducing a new algorithm specifically designed for sparse images. We show that we can further improve this sparse algorithm by specializing it for the data input format, avoiding a decoding step and processing multiple pixels at once. A benchmark on Intel and AMD CPUs shows that the algorithm is from 1.6× to 2.5× faster on sparse images.
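To make the sparse-versus-dense distinction concrete, here is a minimal union-find sketch of connected-components labeling driven only by the list of non-zero pixels. It is not the authors' SparseCCL implementation: the coordinate-list input format, the 8-connectivity choice, and the `sparse_ccl` name are assumptions made for illustration.

```python
# Minimal union-find CCL over a sparse pixel list (illustrative sketch,
# not the SparseCCL algorithm from the paper). Input: iterable of (x, y)
# coordinates of non-zero pixels; output: {pixel: component label}.

def sparse_ccl(pixels, connectivity=8):
    pixels = set(pixels)
    parent = {p: p for p in pixels}

    def find(p):                      # find root with path halving
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    def union(a, b):                  # merge the two components
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Each adjacent pair is handled once, from the raster-later pixel,
    # so only "backward" neighbour offsets need to be checked.
    if connectivity == 8:
        offsets = [(-1, -1), (0, -1), (1, -1), (-1, 0)]
    else:
        offsets = [(0, -1), (-1, 0)]

    for (x, y) in pixels:             # visit only the non-zero pixels
        for dx, dy in offsets:
            if (x + dx, y + dy) in pixels:
                union((x + dx, y + dy), (x, y))

    return {p: find(p) for p in pixels}

# Example: two separate clusters of hits in a mostly empty image.
labels = sparse_ccl([(0, 0), (1, 0), (10, 10), (11, 11)])
```

The key point the abstract makes is visible here: the cost scales with the number of non-zero pixels rather than with the image area, which is why a dense raster-scan algorithm wastes work on mostly empty detector images.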
{"title":"SparseCCL: Connected Components Labeling and Analysis for sparse images","authors":"A. Hennequin, Benjamin Couturier, V. Gligorov, L. Lacassagne","doi":"10.1109/DASIP48288.2019.9049184","DOIUrl":"https://doi.org/10.1109/DASIP48288.2019.9049184","url":null,"abstract":"Connected components labeling and analysis for dense images have been extensively studied on a wide range of architectures. Some applications, like particles detectors in High Energy Physics, need to analyse many small and sparse images at high throughput. Because they process all pixels of the image, classic algorithms for dense images are inefficient on sparse data. We address this inefficiency by introducing a new algorithm specifically designed for sparse images. We show that we can further improve this sparse algorithm by specializing it for the data input format, avoiding a decoding step and processing multiple pixels at once. A benchmark on Intel and AMD CPUs shows that the algorithm is from x 1.6 to x 2.5 faster on sparse images.","PeriodicalId":120855,"journal":{"name":"2019 Conference on Design and Architectures for Signal and Image Processing (DASIP)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129737978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}