{"title":"基于算子流识别和量化参数优化的异构平台预量化深度学习模型","authors":"Kuen-Wey Lin, Yan-Ying Li, Kuan Wang, Ming-Chih Tung","doi":"10.1109/ICASI57738.2023.10179562","DOIUrl":null,"url":null,"abstract":"Quantized deep learning models are suitable for the embedded devices with limited computation resource. For computation-intensive neural network operators such as convolution, heterogeneous platforms with a set of processing units of different types become common in the embedded devices. These embedded devices usually operate on fixed-point calculations; moreover, they rely on customized kernel functions to deploy deep learning models. In this paper, a flow of deploying pre-quantized deep learning models on heterogeneous platforms using TVM is presented. We propose an optimization to convert quantization parameters. To leverage customized kernel functions, we propose the operator flow recognition. To demonstrate our flow, we utilize embARC Machine Learning Inference (embARC MLI), an open-source software library targeted for low-power applications. A set of pre-quantized deep learning models are deployed on a heterogeneous platform comprising x86 and embARC MLI. 
Experimental results show that for each model, the accuracy obtained from the heterogeneous platform is much the same as the one obtained from an x86 platform.","PeriodicalId":281254,"journal":{"name":"2023 9th International Conference on Applied System Innovation (ICASI)","volume":"106 5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deploying Pre-Quantized Deep Learning Models on Heterogeneous Platforms with Operator Flow Recognition and Quantization Parameter Optimization\",\"authors\":\"Kuen-Wey Lin, Yan-Ying Li, Kuan Wang, Ming-Chih Tung\",\"doi\":\"10.1109/ICASI57738.2023.10179562\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Quantized deep learning models are suitable for the embedded devices with limited computation resource. For computation-intensive neural network operators such as convolution, heterogeneous platforms with a set of processing units of different types become common in the embedded devices. These embedded devices usually operate on fixed-point calculations; moreover, they rely on customized kernel functions to deploy deep learning models. In this paper, a flow of deploying pre-quantized deep learning models on heterogeneous platforms using TVM is presented. We propose an optimization to convert quantization parameters. To leverage customized kernel functions, we propose the operator flow recognition. To demonstrate our flow, we utilize embARC Machine Learning Inference (embARC MLI), an open-source software library targeted for low-power applications. A set of pre-quantized deep learning models are deployed on a heterogeneous platform comprising x86 and embARC MLI. 
Experimental results show that for each model, the accuracy obtained from the heterogeneous platform is much the same as the one obtained from an x86 platform.\",\"PeriodicalId\":281254,\"journal\":{\"name\":\"2023 9th International Conference on Applied System Innovation (ICASI)\",\"volume\":\"106 5 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-04-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 9th International Conference on Applied System Innovation (ICASI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICASI57738.2023.10179562\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 9th International Conference on Applied System Innovation (ICASI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASI57738.2023.10179562","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Deploying Pre-Quantized Deep Learning Models on Heterogeneous Platforms with Operator Flow Recognition and Quantization Parameter Optimization
Quantized deep learning models are well suited to embedded devices with limited computational resources. For computation-intensive neural network operators such as convolution, heterogeneous platforms with processing units of different types have become common in embedded devices. These devices usually operate on fixed-point arithmetic; moreover, they rely on customized kernel functions to deploy deep learning models. In this paper, a flow for deploying pre-quantized deep learning models on heterogeneous platforms using TVM is presented. We propose an optimization that converts quantization parameters. To leverage customized kernel functions, we propose operator flow recognition. To demonstrate our flow, we utilize embARC Machine Learning Inference (embARC MLI), an open-source software library targeted at low-power applications. A set of pre-quantized deep learning models is deployed on a heterogeneous platform comprising x86 and embARC MLI. Experimental results show that, for each model, the accuracy obtained on the heterogeneous platform is nearly identical to that obtained on an x86 platform.
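The abstract does not detail how the quantization parameters are converted for fixed-point hardware. A common approach in integer-only inference runtimes (gemmlowp/TFLite-style) is to replace the floating-point requantization scale with an integer multiplier plus a bit shift, so that rescaling an accumulator needs only integer multiply and shift operations. The sketch below is a hypothetical illustration of that general technique under those assumptions, not the authors' implementation; the function names `quantize_multiplier` and `requantize` are invented for this example.

```python
import math

def quantize_multiplier(real_multiplier):
    """Convert a float requantization scale into a signed Q31 fixed-point
    multiplier and a power-of-two shift (hypothetical sketch; not the
    paper's exact method)."""
    if real_multiplier == 0.0:
        return 0, 0
    # real_multiplier == mantissa * 2**exponent, with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(real_multiplier)
    quantized = round(mantissa * (1 << 31))  # scale mantissa to Q31
    if quantized == (1 << 31):               # rounding pushed it out of range
        quantized //= 2
        exponent += 1
    return quantized, exponent

def requantize(acc, multiplier, shift):
    """Rescale an int32 accumulator: approximately acc * multiplier * 2**(shift - 31)."""
    # Fixed-point multiply with rounding, then apply the power-of-two shift.
    rounded = (acc * multiplier + (1 << 30)) >> 31
    return rounded << shift if shift >= 0 else rounded >> -shift
```

With scale 0.5, an accumulator of 100 requantizes to 50; with scale 0.1, an accumulator of 1000 requantizes to 100 — i.e., the integer pipeline reproduces the float rescaling up to rounding, which is what makes deployment on fixed-point processing units practical.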