Improved Building Extraction from Remotely Sensed Images by Integration of Encode–Decoder and Edge Enhancement Models

IF 2.2 | Zone 4, Earth Sciences | Q3 ENVIRONMENTAL SCIENCES | Journal of the Indian Society of Remote Sensing | Pub Date: 2024-09-13 | DOI: 10.1007/s12524-024-01992-1

Somenath Bera, Vandita Srivastava, Vimal K. Shrivastava


Citations: 0

Abstract


Improved Building Extraction from Remotely Sensed Images by Integration of Encode–Decoder and Edge Enhancement Models

Building extraction from high-resolution images is a fundamental task in remote sensing. It supports natural disaster monitoring and urban development. Encoder-decoder convolutional neural networks (CNNs) have provided a paradigm for automatic building extraction. However, extracting building information is difficult because of diverse scales, complex backgrounds and the variety of building structures. Moreover, obtaining accurate boundary information remains challenging because of the various obstructions surrounding buildings. To address these challenges, we propose a dual-branch model. The first branch is the segmentation branch, an encoder-decoder framework (based on the Attention-ResUNet architecture) that combines residual units with an attention network to generate the segmentation mask. The residual units improve the ability to learn deep, complex building features, whereas the attention network focuses on informative spatial information. In addition, a dilated module is positioned at the end of the Attention-ResUNet decoder to capture multiscale information. The second branch is the edge branch, which consists of Canny edge extraction, a morphological operation and a squeeze-excitation network, and is intended to improve the boundary information. The Canny edge detection method extracts the edges of the buildings, which are further enhanced by the morphological operation. A squeeze-excitation network is then added for fine adjustment of the generated feature maps. Finally, the proposed model integrates the segmentation mask obtained from the segmentation branch with the boundary information generated by the edge branch to produce the refined segmentation mask. Experiments were performed on the Massachusetts building dataset and the WHU-I building dataset. The performance of the proposed model is compared with state-of-the-art models such as SegNet, DeepLabV3Plus, UNet, Attention-UNet, ResUNet and Attention-ResUNet. The results demonstrate that the proposed approach improves performance on both datasets. We therefore conclude that the proposed approach has great potential for extracting multiscale information and enhancing the boundary information of buildings.
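The edge branch described in the abstract (edge extraction, a morphological operation, then squeeze-excitation reweighting) can be illustrated with a rough NumPy sketch. This is not the authors' implementation, and several pieces are assumptions: a thresholded gradient magnitude stands in for Canny edge detection (it has no non-maximum suppression or hysteresis), dilation with a square structuring element is assumed because the abstract does not say which morphological operation is used, and the function names, the threshold, and the weight matrices `w1` and `w2` are hypothetical.

```python
import numpy as np


def gradient_edges(image, threshold=0.25):
    """Simplified stand-in for Canny: threshold the gradient magnitude."""
    gy, gx = np.gradient(image.astype(float))  # derivatives along rows, columns
    return np.hypot(gx, gy) > threshold


def dilate(mask, radius=1):
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element.

    Assumed morphological operation; it thickens the detected edges.
    """
    h, w = mask.shape
    padded = np.pad(mask.astype(bool), radius, mode="constant")
    out = np.zeros((h, w), dtype=bool)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def squeeze_excite(features, w1, w2):
    """Squeeze-excitation over feature maps of shape (C, H, W).

    w1 has shape (C // r, C) and w2 has shape (C, C // r); both are
    hypothetical weights that a real network would learn.
    """
    squeezed = features.mean(axis=(1, 2))          # squeeze: global average pool
    hidden = np.maximum(w1 @ squeezed, 0.0)        # excitation, step 1: ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # excitation, step 2: sigmoid
    return features * gates[:, None, None]         # rescale each channel


# Toy example: one square "building" on an empty background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0
edges = dilate(gradient_edges(image))

# Stack the image and its edge map as two channels and reweight them.
features = np.stack([image, edges.astype(float)])
rng = np.random.default_rng(0)
w1 = rng.normal(size=(1, 2))
w2 = rng.normal(size=(2, 1))
refined = squeeze_excite(features, w1, w2)
print(refined.shape)  # (2, 8, 8)
```

In a real pipeline the edge map would more likely come from a library Canny implementation, and the squeeze-excitation weights would be trained jointly with the segmentation branch. How the two branches are fused is not specified in the abstract, so no fusion step is sketched here.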


Source journal

Journal of the Indian Society of Remote Sensing ENVIRONMENTAL SCIENCES-REMOTE SENSING

CiteScore: 4.80

Self-citation rate: 8.00%

Articles published: 163

Review time: 7 months

Journal description: The aims and scope of the Journal of the Indian Society of Remote Sensing are: to help advance, disseminate and apply knowledge of remote sensing technology, which is deemed to include photo interpretation, photogrammetry, aerial photography, image processing, and other related technologies in the survey, planning and management of natural resources and in other areas of application where the technology is considered appropriate; to promote interaction among all persons, bodies, institutions (private and/or state-owned) and industries interested in advancing, disseminating and applying the technology; to encourage and undertake research in remote sensing and related technologies; and to undertake and execute all acts that promote any or all of the aims and objectives of the Indian Society of Remote Sensing.