AKGF: Automatic Kernel Generation for DNN on CPU-FPGA
Authors: Dong Dong, Hongxu Jiang, Boyu Diao
Journal: Computer Journal (JCR category: Computer Science, Hardware & Architecture)
Publication date: 2023-08-11
DOI: https://doi.org/10.1093/comjnl/bxad086
Abstract: While tensor-accelerated compilers have proven effective in deploying deep neural networks (DNNs) on general-purpose hardware, optimizing for FPGAs remains challenging because of complex DNN architectures and heterogeneous, semi-open compute units. This paper introduces the Automatic Kernel Generation for DNN on CPU-FPGA (AKGF) framework for efficient deployment of DNNs on heterogeneous CPU-FPGA platforms. AKGF generates an intermediate representation (IR) of the DNN using TVM’s Halide IR, annotates the operators of the model layers in the IR so that each is computed on the corresponding hardware core, and further optimizes the operator code for the CPU and the FPGA using ARM’s function library and the polyhedral model, improving inference speed and reducing power consumption. Experiments on a CPU-FPGA board validate the effectiveness of AKGF, showing speedups of up to 6.7x over state-of-the-art accelerators together with a 2x power optimization. AKGF thus leverages the computational capabilities of both the CPU and the FPGA for high-performance deployment of DNNs on CPU-FPGA platforms.
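The operator-annotation step described in the abstract can be illustrated with a toy sketch. This is not AKGF's implementation and does not use TVM: the `Op` class, the `FPGA_OPS` rule (convolution and dense layers go to the FPGA, everything else to the CPU), and the five-layer example model are all assumptions made purely for illustration, since the abstract does not state which operators are mapped to which device.

```python
from dataclasses import dataclass

# Hypothetical placement rule (an assumption, not taken from the paper):
# operator kinds in this set are tagged for the FPGA, all others for the CPU.
FPGA_OPS = {"conv2d", "dense"}


@dataclass
class Op:
    name: str                    # layer name in the toy model
    kind: str                    # operator kind, e.g. "conv2d"
    device: str = "unassigned"   # filled in by annotate()


def annotate(ops, fpga_ops=FPGA_OPS):
    """Return a new list of ops, each tagged with a target device."""
    return [
        Op(op.name, op.kind, "fpga" if op.kind in fpga_ops else "cpu")
        for op in ops
    ]


def partition(ops):
    """Group consecutive ops that share a device into (device, names) segments."""
    segments = []
    for op in ops:
        if segments and segments[-1][0] == op.device:
            segments[-1][1].append(op.name)
        else:
            segments.append((op.device, [op.name]))
    return segments


# Toy model: a hypothetical five-layer network.
model = [
    Op("conv1", "conv2d"),
    Op("relu1", "relu"),
    Op("pool1", "max_pool2d"),
    Op("fc1", "dense"),
    Op("prob", "softmax"),
]

annotated = annotate(model)
segments = partition(annotated)
for device, names in segments:
    print(device, names)
```

Each segment in this sketch corresponds to a run of operators that could be compiled as one kernel for a single device. A real flow would also have to account for data transfers at every CPU-FPGA boundary, which the sketch ignores.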
About the journal:
The Computer Journal is one of the longest-established journals serving all branches of the academic computer science community. It is currently published in four sections.