POSS-CNN: An Automatically Generated Convolutional Neural Network with Precision and Operation Separable Structure Aiming at Target Recognition and Detection
{"title":"POSS-CNN: An Automatically Generated Convolutional Neural Network with Precision and Operation Separable Structure Aiming at Target Recognition and Detection","authors":"Jia Hou, Jingyu Zhang, Qi Chen, Siwei Xiang, Yishuo Meng, Jianfei Wang, Cimang Lu, Chen Yang","doi":"10.3390/info14110604","DOIUrl":null,"url":null,"abstract":"Artificial intelligence is changing and influencing our world. As one of the main algorithms in the field of artificial intelligence, convolutional neural networks (CNNs) have developed rapidly in recent years. Especially after the emergence of NASNet, CNNs have gradually pushed the idea of AutoML to the public’s attention, and large numbers of new structures designed by automatic searches are appearing. These networks are usually based on reinforcement learning and evolutionary learning algorithms. However, sometimes, the blocks of these networks are complex, and there is no small model for simpler tasks. Therefore, this paper proposes POSS-CNN aiming at target recognition and detection, which employs a multi-branch CNN structure with PSNC and a method of automatic parallel selection for super parameters based on a multi-branch CNN structure. Moreover, POSS-CNN can be broken up. By choosing a single branch or the combination of two branches as the “benchmark”, as well as the overall POSS-CNN, we can achieve seven models with different precision and operations. The test accuracy of POSS-CNN for a recognition task tested on a CIFAR10 dataset can reach 86.4%, which is equivalent to AlexNet and VggNet, but the operation and parameters of the whole model in this paper are 45.9% and 45.8% of AlexNet, and 29.5% and 29.4% of VggNet. The mAP of POSS-CNN for a detection task tested on the LSVH dataset is 45.8, inferior to the 62.3 of YOLOv3. However, compared with YOLOv3, the operation and parameters of the model in this paper are reduced by 57.4% and 15.6%, respectively. After being accelerated by WRA, POSS-CNN for a detection task tested on an LSVH dataset can achieve 27 fps, and the energy efficiency is 0.42 J/f, which is 5 times and 96.6 times better than GPU 2080Ti in performance and energy efficiency, respectively.","PeriodicalId":38479,"journal":{"name":"Information (Switzerland)","volume":null,"pages":null},"PeriodicalIF":2.4000,"publicationDate":"2023-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information (Switzerland)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/info14110604","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Artificial intelligence is changing and influencing our world. As one of the main algorithms in the field of artificial intelligence, convolutional neural networks (CNNs) have developed rapidly in recent years. Especially after the emergence of NASNet, CNNs have gradually pushed the idea of AutoML to the public’s attention, and large numbers of new structures designed by automatic searches are appearing. These networks are usually based on reinforcement learning and evolutionary learning algorithms. However, sometimes, the blocks of these networks are complex, and there is no small model for simpler tasks. Therefore, this paper proposes POSS-CNN aiming at target recognition and detection, which employs a multi-branch CNN structure with PSNC and a method of automatic parallel selection for super parameters based on a multi-branch CNN structure. Moreover, POSS-CNN can be broken up. By choosing a single branch or the combination of two branches as the “benchmark”, as well as the overall POSS-CNN, we can achieve seven models with different precision and operations. The test accuracy of POSS-CNN for a recognition task tested on a CIFAR10 dataset can reach 86.4%, which is equivalent to AlexNet and VggNet, but the operation and parameters of the whole model in this paper are 45.9% and 45.8% of AlexNet, and 29.5% and 29.4% of VggNet. The mAP of POSS-CNN for a detection task tested on the LSVH dataset is 45.8, inferior to the 62.3 of YOLOv3. However, compared with YOLOv3, the operation and parameters of the model in this paper are reduced by 57.4% and 15.6%, respectively. After being accelerated by WRA, POSS-CNN for a detection task tested on an LSVH dataset can achieve 27 fps, and the energy efficiency is 0.42 J/f, which is 5 times and 96.6 times better than GPU 2080Ti in performance and energy efficiency, respectively.