Algorithm-Hardware Co-Optimization for Energy-Efficient Drone Detection on Resource-Constrained FPGA

2021 International Conference on Field-Programmable Technology (ICFPT), Pub Date: 2021-12-06, DOI: 10.1145/3583074

Han-Sok Suh, Jian Meng, Ty Nguyen, S. Venkataramanaiah, Vijay Kumar, Yu Cao, Jae-sun Seo


Citations: 2

Abstract

Convolutional neural network (CNN) based object detection has achieved very high accuracy; for example, single-shot multi-box detectors (SSD) can efficiently detect and localize various objects in an input image. However, such detectors require large amounts of computation and memory, which makes efficient inference difficult on resource-constrained hardware platforms such as drones or unmanned aerial vehicles (UAVs). Drone/UAV detection is an important task for applications including surveillance, defense, and multi-drone self-localization and formation control. In this paper, we design and co-optimize the algorithm and hardware for energy-efficient drone detection on resource-constrained FPGA devices. We train the SSD object detection algorithm with a custom drone dataset. For inference, we employ low-precision quantization and adapt the width of the SSD CNN model. To improve throughput, we use dual-data-rate operations for DSPs, effectively doubling the throughput with a limited DSP count. For different SSD algorithm models, we analyze accuracy in terms of mean average precision (mAP) and evaluate the corresponding FPGA hardware utilization, DRAM communication, and throughput optimization. Our proposed design achieves a high mAP of 88.42% on the multi-drone dataset, with an energy efficiency of 79 GOPS/W and a throughput of 158 GOPS, using a Xilinx Zynq ZU3EG FPGA device on the Open Vision Computer version 3 (OVC3) platform. Our design achieves 2.7× higher energy efficiency than prior works using the same FPGA device, at a low power consumption of 1.98 W.
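
The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, the DSP count and clock in the last example are made-up placeholders, and the peak-throughput model (one multiply-accumulate, counted as 2 operations, per DSP per DSP clock cycle) is a simplifying assumption about what "doubling the throughput" means.

```python
def energy_efficiency(throughput_gops: float, power_w: float) -> float:
    """Energy efficiency in GOPS/W, i.e. throughput divided by power."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    return throughput_gops / power_w


def implied_baseline(efficiency: float, improvement: float) -> float:
    """Efficiency of the reference design implied by an 'N times higher' claim."""
    if improvement <= 0:
        raise ValueError("improvement factor must be positive")
    return efficiency / improvement


def peak_gops(n_dsp: int, logic_clock_mhz: float, dsp_rate: int = 1) -> float:
    """Idealized peak throughput in GOPS.

    Assumption (not from the paper): each DSP completes one MAC (2 ops)
    per DSP cycle, and the DSPs run at dsp_rate times the logic clock.
    """
    return 2 * n_dsp * logic_clock_mhz * dsp_rate / 1000.0


if __name__ == "__main__":
    # Figures reported in the abstract: 158 GOPS at 1.98 W.
    print(f"158 GOPS / 1.98 W = {energy_efficiency(158, 1.98):.1f} GOPS/W")

    # Reported 79 GOPS/W described as 2.7x higher than prior work.
    print(f"implied prior-work efficiency = {implied_baseline(79, 2.7):.1f} GOPS/W")

    # Hypothetical placeholders: 100 DSPs, 200 MHz logic clock.
    single = peak_gops(100, 200, dsp_rate=1)
    double = peak_gops(100, 200, dsp_rate=2)
    print(f"single-rate peak = {single} GOPS, double-rate peak = {double} GOPS")
```

Dividing the reported throughput by the reported power gives a value slightly above the stated 79 GOPS/W; the small gap presumably comes from rounding in the published figures, though the abstract does not say so. The last example only shows that, under the stated model, running the same number of DSPs at twice the rate doubles the idealized peak.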
