An Object Detection Algorithm Combining FPN Structure With DETR
Authors: Nan Xiang, Chuanzhong Pan, Xiaozhao Li
Published in: Proceedings of the 4th International Conference on Control and Computer Vision
Publication date: 2021-08-13
DOI: 10.1145/3484274.3484284 (https://doi.org/10.1145/3484274.3484284)
Citation count: 1
Abstract
To address the low detection accuracy of the DETR model on small and medium objects, this paper proposes an object detection algorithm that combines improved feature extraction and an FPN structure with DETR. The method first extracts features from the original image with an improved Darknet53 network. During extraction, the 104×104 feature map produced after the first residual block of the second stage is output as an additional, fourth-scale feature map. Together with the feature maps output by the original three stages, this gives four feature maps at different scales. Next, an FPN down-samples and up-samples the four feature maps and merges them into a single output at the 52×52 scale. The fused feature map is then combined with the positional encoding and fed into the Transformer, and FFNs predict the category and position of each object from the Transformer output. On the COCO2017 dataset, the method improves accuracy compared with other models.
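The scale arithmetic behind the four feature maps can be checked with a short sketch. It is an illustration under stated assumptions, not the authors' implementation: the 416×416 input size is inferred from the 104×104 and 52×52 sizes quoted in the abstract, the strides 4, 8, 16, and 32 are the usual Darknet53 down-sampling factors, and the names `extra`, `stage3`, `stage4`, and `stage5` are hypothetical labels introduced here.

```python
# Assumption: a 416x416 input, which is consistent with the 104x104 and
# 52x52 sizes quoted in the abstract (416 / 4 = 104, 416 / 8 = 52).
INPUT_SIZE = 416

# Assumed cumulative strides of the four backbone outputs: the extra
# 104x104 map plus the three original output stages. The labels are
# hypothetical and do not come from the paper.
STRIDES = {"extra": 4, "stage3": 8, "stage4": 16, "stage5": 32}

TARGET_SIZE = 52  # fused output scale stated in the abstract


def feature_sizes(input_size, strides):
    """Return the spatial side length of each feature map."""
    return {name: input_size // stride for name, stride in strides.items()}


def resample_plan(sizes, target):
    """Return the direction and factor needed to bring each map to target."""
    plan = {}
    for name, size in sizes.items():
        if size > target:
            plan[name] = ("down", size // target)
        elif size < target:
            plan[name] = ("up", target // size)
        else:
            plan[name] = ("keep", 1)
    return plan


sizes = feature_sizes(INPUT_SIZE, STRIDES)
plan = resample_plan(sizes, TARGET_SIZE)

for name in STRIDES:
    direction, factor = plan[name]
    print(f"{name}: {sizes[name]}x{sizes[name]} -> {direction} x{factor}")

# DETR-style Transformers flatten the H x W feature map into H*W tokens,
# so a 52x52 fused map gives this sequence length.
sequence_length = TARGET_SIZE * TARGET_SIZE
print("tokens fed to the Transformer:", sequence_length)
```

Under these assumptions, only the 104×104 map needs down-sampling and the two coarsest maps need up-sampling. The resulting token count is considerably larger than for a single coarsest-scale map, which is the cost of keeping finer detail for small objects.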