Searching Tiny Neural Networks for Deployment on Embedded FPGA

Haiyan Qin, Yejun Zeng, Jinyu Bai, Wang Kang
Published in: 2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS)
Publication date: 2023-06-11
DOI: 10.1109/AICAS57966.2023.10168571 (https://doi.org/10.1109/AICAS57966.2023.10168571)

Abstract

Embedded FPGAs have become increasingly popular as acceleration platforms for the deployment of edge-side artificial intelligence (AI) applications, due in part to their flexible and configurable heterogeneous architectures. However, the complex deployment process hinders the realization of AI democratization, particularly at the edge. In this paper, we propose a software-hardware co-design framework that enables simultaneous searching for neural network architectures and corresponding accelerator designs on embedded FPGAs. The proposed framework comprises a hardware-friendly neural architecture search space, a reconfigurable streaming-based accelerator architecture, and a model performance estimator. An evolutionary algorithm targeting multi-objective optimization is employed to identify the optimal neural architecture and corresponding accelerator design. We evaluate our framework on various datasets and demonstrate that, in a typical edge AI scenario, the searched network and accelerator can achieve up to a 2.9% accuracy improvement and up to a 21× speedup compared to manually designed networks based on common accelerator designs when deployed on a widely used embedded FPGA (Xilinx XC7Z020).
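
The abstract does not spell out the search procedure. As a rough illustration only, the Python sketch below shows one way a multi-objective evolutionary search over a joint network/accelerator space could be organized, keeping the Pareto-optimal candidates for accuracy and latency. Everything in it is an assumption made for illustration and not the authors' framework: the names `SPACE`, `estimate`, and `search`, the listed parameter choices, and the toy proxy formulas that stand in for a real model performance estimator.

```python
import random

# Hypothetical joint search space: network choices plus one accelerator choice.
SPACE = {
    "depth": [2, 3, 4],           # number of conv blocks (assumed)
    "width": [8, 16, 32],         # channels per block (assumed)
    "kernel": [1, 3],             # kernel size (assumed)
    "parallelism": [1, 2, 4, 8],  # accelerator parallel units (assumed)
}

def sample(rng):
    """Draw a random candidate from the joint space."""
    return {k: rng.choice(v) for k, v in SPACE.items()}

def mutate(cand, rng):
    """Copy a candidate and resample one randomly chosen field."""
    child = dict(cand)
    key = rng.choice(sorted(SPACE))
    child[key] = rng.choice(SPACE[key])
    return child

def estimate(cand):
    """Toy stand-in for a performance estimator: (accuracy_proxy, latency_proxy)."""
    ops = cand["depth"] * cand["width"] ** 2 * cand["kernel"] ** 2
    accuracy = 1.0 - 1.0 / (1.0 + 0.01 * ops)  # more compute, higher proxy accuracy
    latency = ops / cand["parallelism"]        # more parallelism, lower proxy latency
    return accuracy, latency

def dominates(a, b):
    """True if score a is no worse than b in both objectives and better in one."""
    (acc_a, lat_a), (acc_b, lat_b) = a, b
    return acc_a >= acc_b and lat_a <= lat_b and (acc_a > acc_b or lat_a < lat_b)

def pareto_front(population):
    """Return the unique, non-dominated candidates of a population."""
    unique = {tuple(sorted(c.items())): c for c in population}.values()
    scored = [(c, estimate(c)) for c in unique]
    return [c for c, s in scored if not any(dominates(t, s) for _, t in scored)]

def search(generations=20, pop_size=16, seed=0):
    """Evolve candidates, keeping the non-dominated set as parents each round."""
    rng = random.Random(seed)
    population = [sample(rng) for _ in range(pop_size)]
    for _ in range(generations):
        parents = pareto_front(population)
        children = [mutate(rng.choice(parents), rng) for _ in range(pop_size)]
        population = parents + children  # elitism: parents survive alongside children
    return pareto_front(population)

if __name__ == "__main__":
    for cand in search():
        print(cand, estimate(cand))
```

Returning a Pareto front rather than a single winner leaves the accuracy/latency trade-off to the deployment constraints; a real flow would replace `estimate` with measured or modeled accuracy and FPGA latency.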