Multi-scale Attention Adaptive Network for Object Detection in Remote Sensing Images
Authors: Qixi Tan, W. Xie, Haojin Tang, Yanshan Li
Venue: 2022 5th International Conference on Information Communication and Signal Processing (ICICSP)
Published: 2022-11-26
DOI: https://doi.org/10.1109/ICICSP55539.2022.10050627
Objects in remote sensing images (RSI) vary widely in size, both between and within classes. RSI object detection is a key technology in RSI processing and has been widely applied. Multilevel feature fusion networks are commonly used to improve detection performance. However, existing multilevel feature fusion networks for RSI lack the ability to combine global information. To address this problem, a multi-scale attention adaptive network (MA2Net) is proposed for object detection in RSI. The main contributions of this paper are twofold. First, a multi-scale attention adaptive network is designed to adaptively integrate the multilevel features. This network is composed of an integrating (IG) block, a channel self-attention (CS) block, and an adaptive fusion (AF) block. Specifically, the IG block transforms the multilevel features into an intermediate size. The CS block is an embedded Gaussian self-attention module that models the relationships between feature channels. The AF block learns the multilevel expression of the self-attention features to obtain multi-scale feature maps. Second, to balance the multiple tasks while achieving higher accuracy, a feature align head is used to correctly locate and classify objects. Experimental results on DIOR show that our network achieves higher detection accuracy than state-of-the-art RSI object detectors.
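To make the channel self-attention idea concrete, the following is a minimal pure-Python sketch of Gaussian (softmax-normalised) attention across feature channels. It is not the authors' implementation: the function names are hypothetical, the learned embeddings of an embedded Gaussian module are replaced by identity maps, and the residual connection is an assumption borrowed from common non-local block designs.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_self_attention(features):
    """Gaussian self-attention across channels (illustrative sketch).

    features: list of C channels, each a flat list of N spatial values.
    Assumption: the learned embeddings are identity maps, so the affinity
    is a plain dot product between flattened channels.
    """
    num_channels = len(features)
    num_positions = len(features[0])

    # Pairwise channel affinity: dot product of flattened channels.
    affinity = [
        [sum(a * b for a, b in zip(features[i], features[j]))
         for j in range(num_channels)]
        for i in range(num_channels)
    ]

    # Row-wise softmax turns affinities into attention weights.
    weights = [softmax(row) for row in affinity]

    # Each output channel mixes all channels by its attention weights,
    # then adds the input channel back (assumed residual connection).
    output = []
    for i in range(num_channels):
        mixed = [
            sum(weights[i][j] * features[j][p] for j in range(num_channels))
            for p in range(num_positions)
        ]
        output.append([m + x for m, x in zip(mixed, features[i])])
    return weights, output


# Toy input: 3 channels over a 2x2 map flattened to 4 positions.
# Channels 0 and 1 are identical; channel 2 responds elsewhere.
toy = [
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
weights, attended = channel_self_attention(toy)
```

In this toy input the two identical channels attend to each other equally and assign less weight to the third channel, which is the kind of inter-channel relationship the CS block is described as modelling.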