Hardware Deployment of HBONext using NXP Bluebox 2.0
S. Joshi, M. El-Sharkawy
2021 IEEE World AI IoT Congress (AIIoT), published 2021-05-10
DOI: 10.1109/AIIoT52608.2021.9454210
https://doi.org/10.1109/AIIoT52608.2021.9454210
Citations: 1
Abstract
Deep learning models demand substantial computation and memory, so they are typically run on high-performance computing platforms such as CPUs or GPUs. Because of resource, energy, and real-time constraints, such deployments often cannot meet the requirements of portable applications. As a result, there is growing interest in real-time, CNN-based object recognition on embedded systems with limited resources and tight energy budgets. Recently, hardware accelerators have been developed to provide the computing power needed by AI and machine learning tools. These edge accelerators deliver high performance while maintaining the accuracy the task requires. This paper proposes a design approach for porting CNNs to low-resource embedded systems, bridging the gap between deep learning models and embedded edge systems. To that end, we employ computing approaches that minimize the computational load and memory consumption on the device while maintaining strong deployment performance. HBONext is one such model, designed to be easily deployable on embedded and mobile devices. In this work, we demonstrate how to deploy a real-time HBONext image classifier on NXP BlueBox 2.0. The deployment succeeded largely because of the model's small size of 3 MB. The model was trained and validated on the CIFAR-10 dataset, where it performed well, combining a small size with high accuracy.
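
To put the 3 MB figure in perspective, the short Python sketch below relates a weight-storage budget to a parameter count. It is only an illustration: the abstract does not state HBONext's parameter count or numeric precision, so the 32-bit (4-byte) weights, the convention 1 MB = 2^20 bytes, and the helper names `model_size_mb` and `params_for_budget` are all assumptions made here.

```python
def model_size_mb(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate weight storage in MB (assumed: 1 MB = 2**20 bytes)."""
    return num_params * bytes_per_param / (1024 ** 2)


def params_for_budget(size_mb: float, bytes_per_param: int = 4) -> int:
    """Largest whole number of parameters that fits in a size budget."""
    return int(size_mb * 1024 ** 2) // bytes_per_param


if __name__ == "__main__":
    # Assumed 4-byte (float32) weights; the source does not give the precision.
    budget = params_for_budget(3.0)
    print(f"3 MB at 4 bytes/param holds about {budget} parameters")
    print(f"Round trip: {model_size_mb(budget):.2f} MB")
```

Under these assumptions, a 3 MB budget corresponds to roughly 786 thousand parameters. Storing weights at lower precision (for example, passing `bytes_per_param=1`) would allow proportionally more parameters in the same footprint.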