{"title":"Performance Analysis of Convolutional Neural Networks on Embedded Systems","authors":"Lukasz Grzymkowski, T. Stefański","doi":"10.23919/MIXDES49814.2020.9155741","DOIUrl":null,"url":null,"abstract":"Machine learning is no longer confined to cloud and high-end server systems and has been successfully deployed on devices that are part of Internet of Things. This paper presents the analysis of performance of convolutional neural networks deployed on an ARM microcontroller. Inference time is measured for different core frequencies, with and without DSP instructions and disabled access to cache. Networks use both real-valued and complex-valued tensors and are tested using different inference engines. We conclude that the system must be tuned in a holistic way to achieve optimal efficiency.","PeriodicalId":145224,"journal":{"name":"2020 27th International Conference on Mixed Design of Integrated Circuits and System (MIXDES)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 27th International Conference on Mixed Design of Integrated Circuits and System (MIXDES)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/MIXDES49814.2020.9155741","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Machine learning is no longer confined to cloud and high-end server systems and has been successfully deployed on devices that are part of Internet of Things. This paper presents the analysis of performance of convolutional neural networks deployed on an ARM microcontroller. Inference time is measured for different core frequencies, with and without DSP instructions and disabled access to cache. Networks use both real-valued and complex-valued tensors and are tested using different inference engines. We conclude that the system must be tuned in a holistic way to achieve optimal efficiency.