Yixuan Zhao, Baolei Hu, Feiyang Liu, Tanbao Yan, Han Gao
4th International Conference on Information Science, Electrical and Automation Engineering. Published 2023-08-10. DOI: https://doi.org/10.1117/12.2689581
Design of YOLOv2-tiny accelerator based on PYNQ-Z2 platform
Convolutional neural networks (CNNs) are widely used in image recognition. To meet the massive computational requirements of CNNs, GPUs or other intelligent computing hardware are typically used for data processing. FPGAs support parallel computing and offer programmability, high performance, low energy consumption, and strong stability. In this paper, we improved and optimized the YOLOv2-Tiny algorithm to match the hardware structure of an FPGA. We partitioned the neural network tasks and preprocessed the data into a 16-bit fixed-point representation to reduce hardware resource consumption. Using the PYNQ-Z2 development platform to accelerate the YOLOv2-Tiny CNN, we achieved target object detection and recognition. Compared with a CPU (i7-10710U), the accelerator delivered 2.94 times the processing capacity while consuming 3.1% of the power.
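To make the 16-bit fixed-point step concrete, the following is a minimal sketch of how floating-point weights and activations might be quantized and multiplied in fixed point. The abstract does not state how the 16 bits are split between integer and fractional parts, so the 8 fractional bits (`FRAC_BITS`) and all function names here are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical Q8.8 format: 1 sign bit, 7 integer bits, 8 fractional bits.
# The actual bit split used in the paper is not given in the abstract.
FRAC_BITS = 8
INT16_MIN, INT16_MAX = -32768, 32767


def saturate16(value):
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, value))


def to_fixed16(x, frac_bits=FRAC_BITS):
    """Quantize a float to a signed 16-bit fixed-point integer.

    Python's round() rounds halves to the nearest even integer.
    """
    return saturate16(round(x * (1 << frac_bits)))


def from_fixed16(q, frac_bits=FRAC_BITS):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_dot(weights_q, inputs_q, frac_bits=FRAC_BITS):
    """Dot product of two fixed-point vectors.

    Products carry 2 * frac_bits fractional bits, so they are summed in a
    wide accumulator and shifted right once before saturating to 16 bits.
    """
    acc = sum(w * x for w, x in zip(weights_q, inputs_q))
    return saturate16(acc >> frac_bits)


weights = [0.5, -1.25, 2.0]
inputs = [1.5, 0.75, -0.125]
w_q = [to_fixed16(w) for w in weights]
x_q = [to_fixed16(x) for x in inputs]
result = from_fixed16(fixed_dot(w_q, x_q))
print(result)
```

Storing values as 16-bit integers halves the memory footprint relative to 32-bit floats and lets the multiply-accumulate operations use integer arithmetic, which is the usual motivation for fixed-point designs on resource-constrained FPGAs. The trade-off is a limited range and precision, which is why the sketch saturates rather than wraps on overflow.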