Remote sensing image semantic segmentation method based on small target and edge feature enhancement
Authors: Huaijun Wang, Luqi Qiao, He Li, Xiujuan Li, Junhuai Li, Ting Cao, Chunyi Zhang
Journal: Journal of Applied Remote Sensing (impact factor 1.4; JCR Q4, Environmental Sciences)
Published: 2023-10-18
DOI: https://doi.org/10.1117/1.jrs.17.044503
Semantic segmentation of high-resolution remote sensing images based on deep learning has become an active research topic and is widely applied. In convolutional neural networks, however, extracting target features through multiple convolutional layers tends to lose small-target features and blur the boundaries between ground-object classes. We therefore propose P-Net, a remote sensing image semantic segmentation method that targets small objects and enhances edge features. The network follows an encoder-decoder structure. The decoder comprises a progressive small target feature enhancement network (IFEN), a boundary refinement module (BRM), and a feature aggregation module (FIAM). First, the dense side-output features of the encoder network are used to learn small-target feature information and target edge features. Second, a pyramid segmentation attention module is introduced to extract fine-grained, multi-scale spatial information; it strengthens the feature representation of small targets and yields high-level semantic features. The boundary refinement module refines the low-level spatial features extracted by the encoder. Finally, to improve the accuracy of object segmentation boundaries, skip connections fuse high-level semantic information and low-level spatial information across layers; the connected features have the same spatial resolution but carry different semantic information. The method is evaluated on two public high-resolution remote sensing datasets, the Gaofen Image Dataset (GID) and the Wuhan Dense Labeling Dataset (WHDLD), using six indices: mean intersection over union (MIoU), frequency weighted intersection over union (FWIoU), pixel accuracy (PA), F1, recall, and precision. On GID, these indices reach 78.90%, 78.87%, 87.76%, 87.74%, 87.51%, and 88.04%, respectively; on WHDLD, they reach 63.21%, 75.20%, 84.67%, 75.79%, 76.56%, and 75.45%, respectively. The results show that the proposed method outperforms DeepLabv3+, U-NET, PSPNet, and DUC_HDC; in particular, it recognizes small-target features better and produces clearer boundaries between object categories.
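The six indices named in the abstract can all be derived from a per-class confusion matrix. The sketch below is a minimal pure-Python illustration using the standard textbook definitions; it is not the authors' evaluation code. The function names `confusion_matrix` and `segmentation_metrics` are hypothetical, and macro-averaging precision, recall, and F1 over classes is an assumption, since the abstract does not state which averaging convention was used.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def _safe_div(num: float, den: float) -> float:
    # Classes absent from both labels and predictions contribute 0.
    return num / den if den else 0.0


def segmentation_metrics(cm: List[List[int]]) -> Dict[str, float]:
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = [cm[i][i] for i in range(n)]
    true_count = [sum(cm[i]) for i in range(n)]                       # row sums
    pred_count = [sum(cm[r][i] for r in range(n)) for i in range(n)]  # column sums

    # Per-class IoU = TP / (TP + FP + FN)
    iou = [_safe_div(tp[i], true_count[i] + pred_count[i] - tp[i]) for i in range(n)]
    precision = [_safe_div(tp[i], pred_count[i]) for i in range(n)]
    recall = [_safe_div(tp[i], true_count[i]) for i in range(n)]
    f1 = [_safe_div(2 * precision[i] * recall[i], precision[i] + recall[i])
          for i in range(n)]

    return {
        "pixel_accuracy": _safe_div(sum(tp), total),
        "mean_iou": sum(iou) / n,
        # FWIoU weights each class IoU by that class's share of true pixels.
        "fw_iou": sum(_safe_div(true_count[i], total) * iou[i] for i in range(n)),
        # Assumption: macro-average over classes.
        "precision": sum(precision) / n,
        "recall": sum(recall) / n,
        "f1": sum(f1) / n,
    }


# Toy example with flattened label maps (4 pixels, 2 classes).
labels = [0, 0, 1, 1]
preds = [0, 1, 1, 1]
metrics = segmentation_metrics(confusion_matrix(labels, preds, num_classes=2))
print(metrics["pixel_accuracy"])  # 0.75
```

Because FWIoU weights each class by its pixel frequency while MIoU weights classes equally, the two can diverge on imbalanced data, which would be consistent with the gap between the reported MIoU (63.21%) and FWIoU (75.20%) on WHDLD.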
About the journal:
The Journal of Applied Remote Sensing is a peer-reviewed journal that optimizes the communication of concepts, information, and progress among the remote sensing community.