Energy Efficient Hardware Acceleration of Neural Networks with Power-of-Two Quantisation
Pub Date: 2022-09-30 | DOI: 10.48550/arXiv.2209.15257
Dominika Przewlocka-Rus, T. Kryjak
Deep neural networks dominate most modern vision systems, providing high performance at the cost of increased computational complexity. Since such systems often need to operate both in real time and with minimal energy consumption (e.g., wearable devices, autonomous vehicles, edge Internet of Things (IoT) devices, or sensor networks), various network optimisation techniques are used, e.g., quantisation, pruning, or dedicated lightweight architectures. Because the weights in neural network layers follow an approximately logarithmic distribution, Power-of-Two (PoT) quantisation, which is itself logarithmic, provides high performance even at significantly reduced computational precision (4-bit weights and below). This method also makes it possible to replace the Multiply and ACcumulate (MAC) units typical of neural networks (performing, e.g., convolution operations) with more energy-efficient Bitshift and ACcumulate (BAC) units. In this paper, we show that a hardware neural network accelerator with PoT weights implemented on the Zynq UltraScale+ MPSoC ZCU104 SoC FPGA can be at least 1.4x more energy efficient than the uniform quantisation version. To further reduce the actual power requirement by omitting part of the computation for zero weights, we also propose a new pruning method adapted to logarithmic quantisation.
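Replacing multipliers with shifters is the core of the claimed energy savings, so a minimal NumPy sketch may help illustrate the idea; the bit width, exponent range, and fixed-point scaling below are illustrative assumptions, not the paper's accelerator design.

import numpy as np

def pot_quantise(w, bits=4):
    # Quantise each weight to a signed power of two: w ~ sign(w) * 2**e.
    # One sign bit plus (bits - 1) exponent bits; real designs calibrate
    # the exponent range per layer from the weight distribution.
    sign = np.sign(w).astype(int)
    e = np.round(np.log2(np.abs(w) + 1e-12))
    e = np.clip(e, -(2**(bits - 1) - 1), 0).astype(int)
    return sign, e

def bac_dot(x_int, sign, e, e_min=-7):
    # Bitshift-and-accumulate: the accumulator holds the dot product scaled
    # by 2**-e_min, so each multiply x * 2**e becomes a left shift by e - e_min.
    acc = 0
    for xi, si, ei in zip(x_int, sign, e):
        acc += int(si) * (int(xi) << (int(ei) - e_min))  # shift replaces multiply
    return acc * 2.0**e_min  # rescale the accumulator to the true value

# Usage: compare against the float dot product it approximates.
w = np.array([0.5, -0.23, 0.12])
sign, e = pot_quantise(w)
print(bac_dot([3, 5, -2], sign, e))  # approx. np.dot([3, 5, -2], w)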
{"title":"Energy Efficient Hardware Acceleration of Neural Networks with Power-of-Two Quantisation","authors":"Dominika Przewlocka-Rus, T. Kryjak","doi":"10.48550/arXiv.2209.15257","DOIUrl":"https://doi.org/10.48550/arXiv.2209.15257","url":null,"abstract":"Deep neural networks virtually dominate the domain of most modern vision systems, providing high performance at a cost of increased computational complexity.Since for those systems it is often required to operate both in real-time and with minimal energy consumption (e.g., for wearable devices or autonomous vehicles, edge Internet of Things (IoT), sensor networks), various network optimisation techniques are used, e.g., quantisation, pruning, or dedicated lightweight architectures. Due to the logarithmic distribution of weights in neural network layers, a method providing high performance with significant reduction in computational precision (for 4-bit weights and less) is the Power-of-Two (PoT) quantisation (and therefore also with a logarithmic distribution). This method introduces additional possibilities of replacing the typical for neural networks Multiply and ACcumulate (MAC -- performing, e.g., convolution operations) units, with more energy-efficient Bitshift and ACcumulate (BAC). In this paper, we show that a hardware neural network accelerator with PoT weights implemented on the Zynq UltraScale + MPSoC ZCU104 SoC FPGA can be at least $1.4x$ more energy efficient than the uniform quantisation version. To further reduce the actual power requirement by omitting part of the computation for zero weights, we also propose a new pruning method adapted to logarithmic quantisation.","PeriodicalId":124003,"journal":{"name":"International Conference on Computer Vision and Graphics","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130712387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PointPillars Backbone Type Selection For Fast and Accurate LiDAR Object Detection
Pub Date: 2022-09-30 | DOI: 10.48550/arXiv.2209.15252
K. Lis, T. Kryjak
3D object detection from LiDAR sensor data is an important topic in the context of autonomous cars and drones. In this paper, we present the results of experiments on the impact of the backbone selection of a deep convolutional neural network on detection accuracy and computation speed. We chose the PointPillars network, which is characterised by a simple architecture, high speed, and modularity that allows for easy expansion. During the experiments, we paid particular attention to the change in detection efficiency (measured by the mAP metric) and the total number of multiply-add operations needed to process one point cloud. We tested 10 different convolutional neural network architectures that are widely used in image-based detection problems. For a backbone like MobileNetV1, we obtained an almost 4x speedup at the cost of a 1.13% decrease in mAP. On the other hand, for CSPDarknet we obtained a speedup of more than 1.5x together with a 0.33% increase in mAP. We have thus demonstrated that it is possible to significantly speed up a 3D object detector on LiDAR point clouds with only a small decrease in detection efficiency. This result can be used when PointPillars or similar algorithms are implemented in embedded systems, including SoC FPGAs. The code is available at https://github.com/vision-agh/pointpillars_backbone.
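The per-point-cloud multiply-add count is the paper's main cost metric. The following rough PyTorch sketch shows how such a count can be obtained for a backbone's 2D convolutions using forward hooks; the restriction to Conv2d layers and the input shape in the usage note are illustrative assumptions, not the authors' exact accounting.

import torch
import torch.nn as nn

def count_conv_macs(model, input_shape):
    # Accumulate multiply-accumulate operations over all Conv2d layers
    # for a single forward pass on a zero tensor of the given shape.
    macs = []

    def hook(module, inputs, output):
        # MACs for one Conv2d = kernel ops per output element * output elements.
        kh, kw = module.kernel_size
        cin_per_group = module.in_channels // module.groups
        macs.append(kh * kw * cin_per_group * output.numel())

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.Conv2d)]
    with torch.no_grad():
        model(torch.zeros(input_shape))
    for h in handles:
        h.remove()
    return sum(macs)

# Hypothetical usage on a backbone consuming the pillar pseudo-image
# (the shape is illustrative, not the paper's exact configuration):
# total = count_conv_macs(backbone, (1, 64, 496, 432))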
{"title":"PointPillars Backbone Type Selection For Fast and Accurate LiDAR Object Detection","authors":"K. Lis, T. Kryjak","doi":"10.48550/arXiv.2209.15252","DOIUrl":"https://doi.org/10.48550/arXiv.2209.15252","url":null,"abstract":"3D object detection from LiDAR sensor data is an important topic in the context of autonomous cars and drones. In this paper, we present the results of experiments on the impact of backbone selection of a deep convolutional neural network on detection accuracy and computation speed. We chose the PointPillars network, which is characterised by a simple architecture, high speed, and modularity that allows for easy expansion. During the experiments, we paid particular attention to the change in detection efficiency (measured by the mAP metric) and the total number of multiply-addition operations needed to process one point cloud. We tested 10 different convolutional neural network architectures that are widely used in image-based detection problems. For a backbone like MobilenetV1, we obtained an almost 4x speedup at the cost of a 1.13% decrease in mAP. On the other hand, for CSPDarknet we got an acceleration of more than 1.5x at an increase in mAP of 0.33%. We have thus demonstrated that it is possible to significantly speed up a 3D object detector in LiDAR point clouds with a small decrease in detection efficiency. This result can be used when PointPillars or similar algorithms are implemented in embedded systems, including SoC FPGAs. The code is available at https://github.com/vision-agh/pointpillars_backbone.","PeriodicalId":124003,"journal":{"name":"International Conference on Computer Vision and Graphics","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124020470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traffic Sign Classification Using Deep and Quantum Neural Networks
Pub Date: 2022-09-30 | DOI: 10.48550/arXiv.2209.15251
Sylwia Kuros, T. Kryjak
Quantum Neural Networks (QNNs) are an emerging technology that can be used in many applications, including computer vision. In this paper, we present a traffic sign classification system implemented using a hybrid quantum-classical convolutional neural network. Experiments on the German Traffic Sign Recognition Benchmark dataset indicate that QNNs do not currently outperform classical DCNNs (Deep Convolutional Neural Networks), yet they still achieve an accuracy of over 90% and are a promising solution for advanced computer vision.
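To give a flavour of what "hybrid quantum-classical" means in practice, here is a minimal sketch of a variational quantum layer using PennyLane's default simulator; the circuit structure, qubit count, and feature encoding are assumptions for illustration and do not reproduce the authors' architecture.

import pennylane as qml
import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def quantum_layer(inputs, weights):
    # Encode classical features (e.g., pooled CNN activations) as rotation angles.
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    # Trainable entangling layers form the variational "quantum" part.
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    # Pauli-Z expectation values feed a classical classification head.
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

weights = np.random.uniform(0, np.pi, size=(2, n_qubits))  # (layers, qubits)
features = np.array([0.1, 0.5, 0.9, 0.3])
print(quantum_layer(features, weights))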
{"title":"Traffic Sign Classification Using Deep and Quantum Neural Networks","authors":"Sylwia Kuros, T. Kryjak","doi":"10.48550/arXiv.2209.15251","DOIUrl":"https://doi.org/10.48550/arXiv.2209.15251","url":null,"abstract":"Quantum Neural Networks (QNNs) are an emerging technology that can be used in many applications including computer vision. In this paper, we presented a traffic sign classification system implemented using a hybrid quantum-classical convolutional neural network. Experiments on the German Traffic Sign Recognition Benchmark dataset indicate that currently QNN do not outperform classical DCNN (Deep Convolutuional Neural Networks), yet still provide an accuracy of over 90% and are a definitely promising solution for advanced computer vision.","PeriodicalId":124003,"journal":{"name":"International Conference on Computer Vision and Graphics","volume":"354 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124474801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scene Recognition Using AlexNet to Recognize Significant Events Within Cricket Game Footage
Pub Date: 2020-09-14 | DOI: 10.1007/978-3-030-59006-2_9
Tevin Moodley, D. Haar
{"title":"Scene Recognition Using AlexNet to Recognize Significant Events Within Cricket Game Footage","authors":"Tevin Moodley, D. Haar","doi":"10.1007/978-3-030-59006-2_9","DOIUrl":"https://doi.org/10.1007/978-3-030-59006-2_9","url":null,"abstract":"","PeriodicalId":124003,"journal":{"name":"International Conference on Computer Vision and Graphics","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125324432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A New Watermarking Method for Video Authentication with Tamper Localization
Pub Date: 2020-09-14 | DOI: 10.1007/978-3-030-59006-2_18
Y. Vybornova
{"title":"A New Watermarking Method for Video Authentication with Tamper Localization","authors":"Y. Vybornova","doi":"10.1007/978-3-030-59006-2_18","DOIUrl":"https://doi.org/10.1007/978-3-030-59006-2_18","url":null,"abstract":"","PeriodicalId":124003,"journal":{"name":"International Conference on Computer Vision and Graphics","volume":"301 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122800806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RGB-D and Lidar Calibration Supported by GPU
Pub Date: 2020-09-14 | DOI: 10.1007/978-3-030-59006-2_19
A. Wilkowski, D. Mańkowski
{"title":"RGB-D and Lidar Calibration Supported by GPU","authors":"A. Wilkowski, D. Mańkowski","doi":"10.1007/978-3-030-59006-2_19","DOIUrl":"https://doi.org/10.1007/978-3-030-59006-2_19","url":null,"abstract":"","PeriodicalId":124003,"journal":{"name":"International Conference on Computer Vision and Graphics","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132474153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Video Footage Highlight Detection in Formula 1 Through Vehicle Recognition with Faster R-CNN Trained on Game Footage
Pub Date: 2020-09-14 | DOI: 10.1007/978-3-030-59006-2_16
Ruan Spijkerman, D. Haar
{"title":"Video Footage Highlight Detection in Formula 1 Through Vehicle Recognition with Faster R-CNN Trained on Game Footage","authors":"Ruan Spijkerman, D. Haar","doi":"10.1007/978-3-030-59006-2_16","DOIUrl":"https://doi.org/10.1007/978-3-030-59006-2_16","url":null,"abstract":"","PeriodicalId":124003,"journal":{"name":"International Conference on Computer Vision and Graphics","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130172807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance Evaluation of Selected 3D Keypoint Detector-Descriptor Combinations
Pub Date: 2020-09-14 | DOI: 10.1007/978-3-030-59006-2_17
Paula Štancelová, E. Sikudová, Z. Černeková
{"title":"Performance Evaluation of Selected 3D Keypoint Detector-Descriptor Combinations","authors":"Paula Štancelová, E. Sikudová, Z. Černeková","doi":"10.1007/978-3-030-59006-2_17","DOIUrl":"https://doi.org/10.1007/978-3-030-59006-2_17","url":null,"abstract":"","PeriodicalId":124003,"journal":{"name":"International Conference on Computer Vision and Graphics","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114261832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tuberculosis Abnormality Detection in Chest X-Rays: A Deep Learning Approach
Pub Date: 2020-09-14 | DOI: 10.1007/978-3-030-59006-2_11
M. Oloko-Oba, Serestina Viriri
{"title":"Tuberculosis Abnormality Detection in Chest X-Rays: A Deep Learning Approach","authors":"M. Oloko-Oba, Serestina Viriri","doi":"10.1007/978-3-030-59006-2_11","DOIUrl":"https://doi.org/10.1007/978-3-030-59006-2_11","url":null,"abstract":"","PeriodicalId":124003,"journal":{"name":"International Conference on Computer Vision and Graphics","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121786944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Facial Age Estimation Using Compact Facial Features
Pub Date: 2020-09-14 | DOI: 10.1007/978-3-030-59006-2_1
J. D. Akinyemi, O. Onifade
{"title":"Facial Age Estimation Using Compact Facial Features","authors":"J. D. Akinyemi, O. Onifade","doi":"10.1007/978-3-030-59006-2_1","DOIUrl":"https://doi.org/10.1007/978-3-030-59006-2_1","url":null,"abstract":"","PeriodicalId":124003,"journal":{"name":"International Conference on Computer Vision and Graphics","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121322575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}