Cascaded CNN and global–local attention transformer network-based semantic segmentation for high-resolution remote sensing image

Journal of Applied Remote Sensing | IF 1.4 | Zone 4 (Earth Sciences) | JCR Q4 (Environmental Sciences) | Pub Date: 2024-07-01 | DOI: 10.1117/1.jrs.18.034502
Xiaohui Liu, Lei Zhang, Rui Wang, Xiaoyu Li, Jiyang Xu, Xiaochen Lu
Citations: 0

Abstract

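High-resolution remote sensing images (HRRSIs) contain rich local spatial information and long-range positional dependencies, both of which matter for semantic segmentation and have drawn growing research attention. However, because ground objects are diverse and complex, HRRSIs often exhibit large intraclass variance and small interclass variance, which makes semantic segmentation difficult. In most networks, insufficient local feature extraction and poor use of global information produce segmentation results in which many small-scale objects are omitted and large-scale objects are fragmented. We propose the CNN-transformer cascade network, which cascades a convolutional neural network with a global–local attention transformer. First, convolution blocks and global–local attention transformer blocks extract multiscale local features and long-range location information, respectively. Then a multilevel channel attention integration block fuses geometric and semantic features from different depths and revises the channel weights through a channel attention module to resist interference from redundant information. Finally, upsampling with a deconvolution operation improves the smoothness of the segmentation. We compare our method with several state-of-the-art methods on the ISPRS Vaihingen and Potsdam datasets. Experimental results show that our method improves the integrity and independence of multiscale object segmentation results.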
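The abstract does not describe the internals of the multilevel channel attention integration block. As a rough illustration only, the sketch below assumes a squeeze-and-excitation-style gate: shallow (geometric) and deep (semantic) feature maps are concatenated along the channel axis, each channel is averaged, a two-layer gate turns those averages into weights in (0, 1), and every channel is rescaled by its weight. The function names, the weight matrices `w_reduce` and `w_expand`, and the assumption that both levels already share one spatial size are hypothetical and are not taken from the paper.

```python
import math


def global_avg_pool(channel):
    """Mean value of one H x W feature map (the "squeeze" step)."""
    total = sum(sum(row) for row in channel)
    count = sum(len(row) for row in channel)
    return total / count


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(channels, w_reduce, w_expand):
    """SE-style channel gate (an assumed form, not the paper's module).

    channels: list of C feature maps, each an H x W nested list.
    w_reduce: (C/r) x C matrix, w_expand: C x (C/r) matrix (hypothetical).
    Returns the rescaled channels and the per-channel weights.
    """
    squeezed = [global_avg_pool(ch) for ch in channels]
    hidden = [max(0.0, h) for h in matvec(w_reduce, squeezed)]  # ReLU
    weights = [sigmoid(z) for z in matvec(w_expand, hidden)]
    scaled = [
        [[weights[i] * value for value in row] for row in ch]
        for i, ch in enumerate(channels)
    ]
    return scaled, weights


def fuse_levels(shallow, deep, w_reduce, w_expand):
    """Concatenate two feature levels along channels, then reweight them."""
    return channel_attention(shallow + deep, w_reduce, w_expand)


if __name__ == "__main__":
    shallow = [[[1.0, 1.0], [1.0, 1.0]]]   # one shallow channel, 2 x 2
    deep = [[[0.0, 2.0], [2.0, 4.0]]]      # one deep channel, 2 x 2
    fused, weights = fuse_levels(shallow, deep, [[0.5, 0.5]], [[2.0], [-2.0]])
    print(weights)
```

In a trained network the gate weights would be learned, so channels carrying redundant information could receive weights near zero; here they are fixed by hand purely to show the data flow.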
Source journal
Journal of Applied Remote Sensing (Environmental Sciences; Imaging Science & Photographic Technology)
CiteScore: 3.40
Self-citation rate: 11.80%
Articles published: 194
Review time: 3 months
About the journal: The Journal of Applied Remote Sensing is a peer-reviewed journal that optimizes the communication of concepts, information, and progress among the remote sensing community.
Latest articles from this journal
Monitoring soil moisture in cotton fields with synthetic aperture radar and optical data in arid and semi-arid regions
Cascaded CNN and global–local attention transformer network-based semantic segmentation for high-resolution remote sensing image
Coastal chlorophyll-a concentration estimation by fusion of Sentinel-2 multispectral instrument and in-situ hyperspectral data
Spectral index for estimating leaf water content across diverse plant species using multiple viewing angles
Optimal band selection using explainable artificial intelligence for machine learning-based hyperspectral image classification