Somenath Bera, Vandita Srivastava, Vimal K. Shrivastava
Journal of the Indian Society of Remote Sensing. Published 2024-09-13. Impact factor: 2.2. JCR: Q3 (Environmental Sciences).
DOI: 10.1007/s12524-024-01992-1 (https://doi.org/10.1007/s12524-024-01992-1)
Improved Building Extraction from Remotely Sensed Images by Integration of Encode–Decoder and Edge Enhancement Models
Building extraction from high-resolution images is a fundamental task in remote sensing. It helps in monitoring natural disasters and in developing urban areas. Encoder–decoder convolutional neural networks (CNNs) have provided a paradigm for automatic building extraction. However, extracting building information is difficult for many reasons, such as diverse scales, complex backgrounds and the variety of building structures. Moreover, obtaining accurate boundary information remains challenging because of the various impediments surrounding buildings. To deal with these challenges, we propose a dual-branch model in this article. One branch is the segmentation branch, which uses an encoder–decoder framework (based on the Attention-ResUNet architecture) that combines residual units and an attention network to generate the segmentation mask. The residual units improve the ability to learn deep and complex building features, whereas the attention network focuses on informative spatial information. In addition, a dilated module is positioned at the end of the Attention-ResUNet decoder to capture multiscale information. The other branch is the edge branch, which consists of Canny edge extraction, a morphological operation and a squeeze-excitation network, and is intended to improve the boundary information. The Canny edge detection method extracts the edges of the buildings, which are further enhanced through the morphological operation. A squeeze-excitation network is then added for fine adjustment of the generated feature maps. Finally, the proposed model integrates the segmentation mask obtained from the segmentation branch with the boundary information generated by the edge branch to produce the refined segmentation mask. Experiments were performed on the Massachusetts building dataset and the WHU-I building dataset. The performance of the proposed model is compared with state-of-the-art models such as SegNet, DeepLabV3Plus, UNet, Attention-UNet, ResUNet and Attention-ResUNet. The results demonstrate that the proposed approach improves performance on both datasets. Hence, we conclude that the proposed approach has great potential for extracting multiscale information and enhancing the boundary information of buildings.
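Two of the edge-branch steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes the Canny edge map is already available as a binary grid, uses dilation as one possible morphological operation (the abstract does not say which one is used), and takes hypothetical weight matrices `w1` and `w2` for the squeeze-excitation gates. The function names `dilate` and `squeeze_excite` are illustrative only.

```python
import math


def dilate(edges, radius=1):
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element.

    Assumption: dilation stands in for the unspecified morphological
    operation that thickens the detected building edges.
    """
    h, w = len(edges), len(edges[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if edges[y][x]:
                # Switch on every pixel inside the window around (y, x).
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 1
    return out


def squeeze_excite(channels, w1, w2):
    """Channel reweighting in the squeeze-and-excitation style.

    channels: list of C feature maps, each a 2D list of floats.
    w1: hypothetical bottleneck weights, shape (hidden, C).
    w2: hypothetical expansion weights, shape (C, hidden).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in channels]
    # Excitation: fully connected layer + ReLU, then one more + sigmoid.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, channels)]


if __name__ == "__main__":
    edge_map = [[0] * 5 for _ in range(5)]
    edge_map[2][2] = 1  # a single edge pixel in the centre
    for row in dilate(edge_map):
        print(row)
```

In a trained network the gate weights are learned, and the reweighted edge features would then be combined with the segmentation mask. The abstract does not describe how that fusion is done, so it is left out of the sketch.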
About the journal:
The aims and scope of the Journal of the Indian Society of Remote Sensing are to help towards advancement, dissemination and application of the knowledge of Remote Sensing technology, which is deemed to include photo interpretation, photogrammetry, aerial photography, image processing, and other related technologies in the field of survey, planning and management of natural resources and other areas of application where the technology is considered to be appropriate, to promote interaction among all persons, bodies, institutions (private and/or state-owned) and industries interested in achieving advancement, dissemination and application of the technology, to encourage and undertake research in remote sensing and related technologies and to undertake and execute all acts which shall promote all or any of the aims and objectives of the Indian Society of Remote Sensing.