Multi-branch reverse attention semantic segmentation network for building extraction
Wenxiang Jiang, Yan Chen, Xiaofeng Wang, Menglei Kang, Mengyuan Wang, Xuejun Zhang, Lixiang Xu, Cheng Zhang
Egyptian Journal of Remote Sensing and Space Sciences, Volume 27, Issue 1, Pages 10-17. Journal Article, published 2023-12-16.
DOI: 10.1016/j.ejrs.2023.12.003
URL: https://www.sciencedirect.com/science/article/pii/S1110982323001035
Impact factor: 3.7. JCR: Q2 (Environmental Sciences).
Citations: 0
Abstract
Extracting the color and texture features of buildings from high-resolution remote sensing images is often hampered by background interference and by variation in target scale. In addition, most current attention mechanisms optimize building extraction by selecting key building features but ignore the influence of the complex background. We therefore propose incorporating a novel reverse attention module into the network. The module enables the model to selectively extract crucial building features while suppressing intricate background noise. It mitigates the effect that heterogeneous background targets with spectra and structures similar to those of buildings have on building segmentation and extraction, which improves the overall generalizability of the model. Reverse attention also emphasizes and amplifies details along the boundaries of the target. Furthermore, we couple a new multi-branch convolution block into the encoder, integrating dilated convolutions with multiple dilation rates. Whereas other methods use a single multi-scale module to extract multi-scale information from high-level features only, we use convolutions with different receptive fields to capture multi-scale targets from multi-level features simultaneously, effectively improving the model's ability to extract multi-scale building features. Experiments show that the proposed multi-branch reverse attention semantic segmentation network achieves IoU of 90.59% and 81.79% on the well-known WHU building and Inria aerial image datasets, respectively.
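As an illustration only, the sketch below shows two ideas the abstract relies on: running several dilated-convolution branches with different dilation rates over the same input, and scoring a binary building mask with intersection over union (IoU). It is not the authors' implementation. The 1-D toy signal, the dilation rates (1, 2, 4), the element-wise sum used to fuse branches, and all function names are assumptions chosen for brevity; the abstract does not state the paper's rates or fusion scheme.

```python
from typing import List, Sequence


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Zero-padded 1-D dilated convolution (cross-correlation form).

    The output has the same length as the input; the kernel is centered
    on each position when the kernel size is odd.
    """
    half = dilation * (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i - half + j * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):     # positions outside count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def multi_branch(signal: Sequence[float], kernel: Sequence[float],
                 dilations: Sequence[int] = (1, 2, 4)) -> List[float]:
    """One branch per dilation rate, fused by element-wise sum.

    Summation is an assumption for this sketch, not the paper's stated choice.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) for values in zip(*branches)]


def iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary masks given as nested 0/1 lists."""
    pairs = [(p, t) for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row)]
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return intersection / union if union else 1.0  # two empty masks agree


# Toy usage: a single impulse shows how far each branch "sees".
impulse = [0, 0, 0, 1, 0, 0, 0]
fused = multi_branch(impulse, [1, 1, 1])            # [0, 1, 1, 3, 1, 1, 0]
spans = [effective_kernel_size(3, d) for d in (1, 2, 4)]  # [3, 5, 9]
score = iou([[1, 1, 0], [0, 1, 0]], [[1, 0, 0], [0, 1, 1]])  # 0.5
```

With a size-3 kernel, the three branches span 3, 5, and 9 input positions, so the same block responds to both small and large structures. The IoU values quoted in the abstract are percentages of this same ratio, computed over the building class.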
About the journal:
The Egyptian Journal of Remote Sensing and Space Sciences (EJRS) encompasses a comprehensive range of topics within Remote Sensing, Geographic Information Systems (GIS), planetary geology, and space technology development, including theories, applications, and modeling. EJRS aims to disseminate high-quality, peer-reviewed research focusing on the advancement of remote sensing and GIS technologies and their practical applications for effective planning, sustainable development, and environmental resource conservation. The journal particularly welcomes innovative papers with broad scientific appeal.