Optimization of Deep Learning Inference on Edge Devices
Endah Kristiani, Chao-Tung Yang, K. L. Phuong Nguyen
2020 International Conference on Pervasive Artificial Intelligence (ICPAI), published 2020-12-01
DOI: 10.1109/ICPAI51961.2020.00056 (https://doi.org/10.1109/ICPAI51961.2020.00056)
Citation count: 6
Abstract
Artificial Intelligence (AI)-based applications that run in real time need low inference latency. This paper implements and compares two models, Inception V3 and Mobilenet, using the Intel Neural Compute Stick (NCS) 2 and the Raspberry Pi 4 as the edge devices. The models are optimized with the Model Optimizer (MO), which generates an Intermediate Representation (IR) of the network. Inference is then run on the IR models on the edge device. Finally, the two models are compared in terms of speed, measured in frames per second (FPS), and precision. The results show that Inception V3 runs at 9 frames per second, while Mobilenet runs at 24 frames per second. In terms of accuracy, Inception V3 reaches 41.28% but produces a misclassification for the Nissan Altima 2014, whereas Mobilenet reaches 71.29% with a correct classification for the Toyota Camry 2014.
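The relation between the reported frame rates and per-frame latency can be sketched as follows. This is a minimal illustration, not the authors' benchmarking code: `measure_fps`, `latency_ms`, and the stand-in `fake_infer` function are hypothetical names introduced here, and a real measurement would call the IR model on the edge device instead of the stub.

```python
import time
from typing import Callable, Iterable


def measure_fps(infer: Callable[[object], object], frames: Iterable[object]) -> float:
    """Run `infer` on every frame and return frames per wall-clock second."""
    count = 0
    start = time.perf_counter()
    for frame in frames:
        infer(frame)  # hypothetical: one inference call per frame
        count += 1
    elapsed = time.perf_counter() - start
    if count == 0:
        raise ValueError("no frames were processed")
    # Guard against a zero interval on very coarse timers.
    return count / elapsed if elapsed > 0 else float("inf")


def latency_ms(fps: float) -> float:
    """Average time per frame, in milliseconds, implied by a frame rate."""
    return 1000.0 / fps


def fake_infer(frame: object) -> None:
    """Stand-in for a real model call (assumption: about 2 ms per frame)."""
    time.sleep(0.002)


if __name__ == "__main__":
    # Frame rates reported in the abstract.
    print(f"Inception V3: {latency_ms(9):.1f} ms per frame")   # 111.1
    print(f"Mobilenet:    {latency_ms(24):.1f} ms per frame")  # 41.7
    print(f"Speed ratio:  {24 / 9:.2f}x")                      # 2.67
    print(f"Stub FPS:     {measure_fps(fake_infer, range(20)):.0f}")
```

Under these figures, 9 FPS corresponds to roughly 111 ms per frame and 24 FPS to roughly 42 ms per frame, so Mobilenet processes frames about 2.67 times faster than Inception V3 in this setup.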