{"title":"RetinaMHSA: Improving in single-stage detector with self-attention","authors":"S. S. Fard, A. Amirkhani, M. Mosavi","doi":"10.1109/ICSPIS54653.2021.9729362","DOIUrl":null,"url":null,"abstract":"In recent years, object detection with two-stage methods is one of the highest accuracies, like faster R-CNN. One-stage methods which use a typical dense sampling of likely item situations may be speedier and more straightforward. However, it has not exceeded the two-stage detectors' accuracy. This study utilizes a Retina network with a backbone ResNet50 block with multi-head self-attention (MHSA) to enhance one-stage method issues, especially small objects. RetinaNet is an efficient and accurate network and uses a new loss function. We swapped c5 in the ResNet50 block with MHSA, while we also used the features of the Retina network. Furthermore, compared to the ResNet50 block, it contains fewer parameters. The results of our study on the Pascal VOC 2007 dataset revealed that the number 81.86 % mAP was obtained, indicating that our technique may achieve promising performance compared to several current two-stage approaches.","PeriodicalId":286966,"journal":{"name":"2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSPIS54653.2021.9729362","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
In recent years, two-stage methods such as Faster R-CNN have achieved some of the highest accuracies in object detection. One-stage methods, which apply dense sampling over likely object locations, can be faster and simpler, but they have not surpassed the accuracy of two-stage detectors. This study uses a RetinaNet with a ResNet50 backbone augmented by multi-head self-attention (MHSA) to address the weaknesses of one-stage methods, particularly on small objects. RetinaNet is an efficient and accurate network that uses a novel loss function. We replaced the c5 stage of the ResNet50 backbone with MHSA while retaining the rest of RetinaNet's feature pipeline; the modified backbone also contains fewer parameters than the original ResNet50 block. On the Pascal VOC 2007 dataset, our method achieved 81.86% mAP, indicating that it can reach performance comparable to several current two-stage approaches.
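The core modification described above, swapping the 3x3 convolution in the last (c5) ResNet50 stage for multi-head self-attention over spatial positions, can be sketched as follows. This is a minimal illustrative sketch in PyTorch, not the paper's exact implementation: the channel widths, head count, and the absence of positional encodings are assumptions, and `BottleneckMHSA` is a hypothetical name.

```python
import torch
import torch.nn as nn

class BottleneckMHSA(nn.Module):
    """Residual bottleneck block where the usual 3x3 convolution is
    replaced by multi-head self-attention over all spatial positions.
    Sizes below (2048 -> 512 -> 2048, 4 heads) are illustrative guesses
    at a c5-style stage, not the paper's confirmed configuration."""

    def __init__(self, in_ch=2048, mid_ch=512, heads=4):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, mid_ch, kernel_size=1, bias=False)
        self.attn = nn.MultiheadAttention(mid_ch, heads, batch_first=True)
        self.expand = nn.Conv2d(mid_ch, in_ch, kernel_size=1, bias=False)
        self.norm = nn.BatchNorm2d(in_ch)

    def forward(self, x):
        b, _, h, w = x.shape
        y = self.reduce(x)                  # (B, mid_ch, H, W)
        seq = y.flatten(2).transpose(1, 2)  # (B, H*W, mid_ch) token sequence
        seq, _ = self.attn(seq, seq, seq)   # self-attention over positions
        y = seq.transpose(1, 2).reshape(b, -1, h, w)
        return torch.relu(self.norm(self.expand(y)) + x)  # residual add

block = BottleneckMHSA()
feat = torch.randn(1, 2048, 7, 7)  # a typical c5-sized feature map
out = block(feat)
print(out.shape)                   # spatial size and channels are preserved
```

One point this sketch makes concrete is the parameter claim: attention projections cost roughly 4 * 512 * 512 weights here, versus 9 * 512 * 512 for the 3x3 convolution they replace, which is consistent with the abstract's statement that the modified block has fewer parameters.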