AFANet: A Multibackbone Compatible Feature Fusion Framework for Effective Remote Sensing Object Detection
Qingming Yi; Mingfeng Zheng; Min Shi; Jian Weng; Aiwen Luo
IEEE Geoscience and Remote Sensing Letters, vol. 21, pp. 1-5. Published 2024-09-16. DOI: 10.1109/LGRS.2024.3462089
Full text: https://ieeexplore.ieee.org/document/10681114/
Abstract
Remote sensing object detection (RSOD) using convolutional neural networks (CNNs) continues to pose challenges in achieving high detection accuracy due to the inherent complexity of remote sensing images, characterized by intricate backgrounds, massive multiscale objects with irregular shapes, and significant variations. In addition, existing RSOD methods often rely on a particular backbone architecture, hindering their adaptability to achieve high accuracy across diverse networks with varying backbones. To address these challenges, we propose a novel multibackbone compatible feature fusion framework termed attention-aware feature aggregation network (AFANet). First, a multibranch attention-based semantic aggregation (MASA) module is introduced to adaptively capture the high-level semantic information. Second, the multiscale spatial features are integrated with the semantic information using a self-attention-guided global contextual feature fusion (SGCFF) strategy. Finally, we incorporate a dual-attention mechanism to capture more fine-grained features to detect small objects. Extensive experiments on the DIOR and NWPU VHR-10 datasets demonstrate the effectiveness of the proposed AFANet across various backbones, achieving superior detection accuracy. The code is available at
https://github.com/lawlawCodes/AFANet.
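The abstract does not disclose implementation details of the MASA or SGCFF modules, but the general idea behind attention-guided feature fusion, weighting multiscale feature maps by learned or derived attention scores before aggregation, can be sketched as follows. This is a minimal NumPy illustration of the concept, not the authors' AFANet code; the function names and the pooling-plus-softmax weighting scheme here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(features):
    """Fuse same-shaped feature maps of shape (C, H, W) using
    attention weights derived from each map's global average pooling."""
    # One scalar descriptor per feature map: its mean activation.
    scores = np.array([f.mean() for f in features])        # shape (N,)
    # Normalize descriptors into attention weights summing to 1.
    weights = softmax(scores)                              # shape (N,)
    # Weighted sum of the input maps.
    fused = sum(w * f for w, f in zip(weights, features))  # shape (C, H, W)
    return fused, weights

# Example: fuse two random 8-channel, 16x16 feature maps.
rng = np.random.default_rng(0)
f1 = rng.standard_normal((8, 16, 16))
f2 = rng.standard_normal((8, 16, 16))
fused, w = attention_fuse([f1, f2])
```

In a real detector these weights would typically come from a small learned subnetwork rather than plain average pooling, and the fusion would sit between the backbone's multiscale outputs and the detection head, which is where the abstract places AFANet's modules.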