AFMTD: Anchor-free Frame for Multi-scale Target Detection
Xueting Liu, Jingrou Xu, Ruoxi Lin, Jinyang Pan, Junying Mao, Guangqiang Yin
2022 7th International Conference on Communication, Image and Signal Processing (CCISP), November 2022
DOI: 10.1109/CCISP55629.2022.9974392
Abstract
Target detection is one of the most fundamental tasks in computer vision. Deep learning has substantially advanced target detection, but performance on multi-scale targets remains poor. Two factors account for this: first, small targets carry little semantic information, which makes them hard for an algorithm to detect; second, the sample distribution in practical application scenarios is random, so the features of targets at different scales interfere with one another, degrading multi-scale detection. To address these issues, we propose an anchor-free framework for multi-scale target detection (AFMTD). First, on the feature-fusion side, we propose a spatial attention fusion module (SAFM): building on Bi-FPN, it introduces a same-scale transformation (SST) that strengthens the valuable information exchanged between adjacent feature layers and suppresses interfering features, improving both detection accuracy and the ability to resolve targets of different scales. Second, on the anchor-free detection side, we propose a heatmap-based multi-scale detection module (HMDM): by introducing a scale distribution mechanism (SDM) and a Heatmap-IoU (HIOU) loss function, it allocates targets to their corresponding feature maps, making the model converge faster and more accurately. In experiments on the MS COCO dataset, our approach achieves 40.5% average precision (AP), and the AP for large, medium, and small targets is 24.5%, 44.1%, and 53.9%, respectively.
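The abstract states that SAFM builds on Bi-FPN but does not detail the fusion rule. As background, a minimal sketch of the fast normalized weighted fusion used by standard Bi-FPN (EfficientDet), which modules like SAFM typically extend, is shown below; the function name and the NumPy setting are illustrative, not the paper's implementation.

```python
import numpy as np

def fused_feature(inputs, weights, eps=1e-4):
    """Fast normalized weighted fusion, as in standard Bi-FPN (illustrative).

    inputs  : list of feature maps already resized to the same H x W x C
    weights : one learnable scalar per input; ReLU keeps them non-negative
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # ReLU on fusion weights
    w = w / (w.sum() + eps)                                # normalize to ~1
    return sum(wi * x for wi, x in zip(w, inputs))

# Two same-shaped feature maps from adjacent pyramid levels
p3 = np.ones((8, 8, 16))
p4 = np.full((8, 8, 16), 3.0)
fused = fused_feature([p3, p4], [1.0, 1.0])  # each value close to 2.0
```

With equal weights the fusion reduces to (almost) an average; during training the weights learn how much each adjacent layer should contribute, which is the behavior SAFM's attention-based design refines.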
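The scale distribution mechanism (SDM) is described only as allocating different targets to different feature maps. A common way to realize such an allocation (used, e.g., in FCOS-style detectors) is to route each target to a pyramid level by its size; the sketch below illustrates that idea with hypothetical size ranges, not the paper's actual thresholds.

```python
def assign_level(box_w, box_h, level_ranges):
    """Assign a target to a pyramid level by its scale (illustrative).

    level_ranges: per-level (min_size, max_size) bounds; a target goes to
    the level whose range contains its longer side.
    """
    size = max(box_w, box_h)
    for level, (lo, hi) in enumerate(level_ranges):
        if lo <= size < hi:
            return level
    return len(level_ranges) - 1  # fall back to the coarsest level

# Hypothetical size bounds in pixels, finest to coarsest level
ranges = [(0, 64), (64, 128), (128, 256), (256, float("inf"))]
assign_level(40, 30, ranges)    # small target -> finest map (level 0)
assign_level(200, 220, ranges)  # larger target -> coarser map (level 2)
```

Routing each scale to its own heatmap keeps targets of different sizes from competing on the same feature map, which is consistent with the faster, more accurate convergence the abstract attributes to SDM and the HIOU loss.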