
Latest publications: 2019 Conference on Design and Architectures for Signal and Image Processing (DASIP)

CNN hardware acceleration on a low-power and low-cost APSoC
Pub Date: 2019-10-01 · DOI: 10.1109/DASIP48288.2019.9049213
P. Meloni, Antonio Garufi, Gianfranco Deriu, Marco Carreras, Daniela Loi
Deep learning, and Convolutional Neural Networks (CNNs) in particular, is currently one of the most promising and widely used classes of algorithms in the field of artificial intelligence, employed in a wide range of tasks. However, their high computational complexity and storage demands limit their efficient deployment on resource-limited embedded systems and IoT devices. To address this problem, a wide landscape of customized FPGA-based hardware acceleration solutions has been presented in the literature in recent years, focused on combining high performance with power efficiency. Most of them are implemented on mid- to high-range devices featuring several computing cores, and target intensive models such as AlexNet and VGG16. In this work, we implement a CNN inference accelerator on a compact and cost-optimized device, the MiniZed development board from Avnet, which integrates a single-core Zynq 7Z007S. We measure the execution time and energy consumption of the developed accelerator and compare it with a CPU-based software implementation. The results show that the accelerator achieves a frame rate of 13 fps on the end-to-end execution of the ALL-CNN-C model, and 4 fps on DarkNet. Compared with the software implementation, it is 5 times faster, providing up to 10.62 giga operations per second (GOPS) at 80 MHz while consuming 1.08 W of on-chip power.
Citations: 5
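The headline figures above can be cross-checked with a short back-of-the-envelope calculation. The inputs (10.62 GOPS, 80 MHz, 1.08 W, 13 fps) come from the abstract; the derived quantities (operations per clock cycle, energy per frame) are illustrative and not reported by the authors:

```python
# Back-of-the-envelope check of the reported accelerator figures.
# Inputs are from the abstract; derived numbers are illustrative only.

gops = 10.62e9       # peak throughput, operations per second
clock_hz = 80e6      # accelerator clock frequency
power_w = 1.08       # on-chip power
fps_allcnn = 13      # end-to-end frame rate on ALL-CNN-C

# Sustained parallelism: how many operations complete per clock cycle.
ops_per_cycle = gops / clock_hz

# Energy budget per processed frame at the reported frame rate.
energy_per_frame_j = power_w / fps_allcnn

print(f"ops/cycle: {ops_per_cycle:.1f}")                  # ~132.8
print(f"energy/frame: {energy_per_frame_j * 1e3:.1f} mJ")  # ~83.1 mJ
```

Roughly 133 operations per cycle implies a wide parallel datapath, which is consistent with the FPGA-based design the abstract describes.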
SparseCCL: Connected Components Labeling and Analysis for sparse images
Pub Date: 2019-10-01 · DOI: 10.1109/DASIP48288.2019.9049184
A. Hennequin, Benjamin Couturier, V. Gligorov, L. Lacassagne
Connected components labeling and analysis for dense images has been extensively studied on a wide range of architectures. Some applications, like particle detectors in High Energy Physics, need to analyse many small and sparse images at high throughput. Because they process all pixels of the image, classic algorithms for dense images are inefficient on sparse data. We address this inefficiency by introducing a new algorithm specifically designed for sparse images. We show that we can further improve this sparse algorithm by specializing it for the data input format, avoiding a decoding step and processing multiple pixels at once. A benchmark on Intel and AMD CPUs shows that the algorithm is 1.6× to 2.5× faster on sparse images.
Citations: 3
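The core idea — labeling by iterating over only the foreground pixel list instead of scanning the dense image — can be sketched with a generic union-find pass. This is a minimal illustration of that idea, not the authors' SparseCCL algorithm (which further specializes for the detector's input format and processes multiple pixels at once):

```python
# Sketch of sparse connected-components labeling: visit only the
# foreground pixels, never the full image. Generic union-find version,
# not the paper's SparseCCL implementation.

def label_sparse(pixels):
    """pixels: iterable of (row, col) foreground coordinates.
    Returns {(row, col): component_id} using 4-connectivity."""
    pixel_set = set(pixels)
    parent = {p: p for p in pixel_set}

    def find(p):
        # Find the root of p's component, with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Cost is proportional to the number of foreground pixels,
    # not to the image area.
    for (r, c) in pixel_set:
        for nb in ((r - 1, c), (r, c - 1)):  # up and left neighbours
            if nb in pixel_set:
                union((r, c), nb)

    # Flatten component roots to small consecutive ids.
    roots = {}
    return {p: roots.setdefault(find(p), len(roots)) for p in pixel_set}

# A 5x5 image with only 5 foreground pixels forming 2 components:
labels = label_sparse([(0, 0), (0, 1), (1, 1), (3, 3), (4, 3)])
```

The payoff on sparse data is that the loop runs over the pixel list rather than all rows and columns, which is exactly the inefficiency the paper targets in dense-image algorithms.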