Proxy and Cross-Stripes Integration Transformer for Remote Sensing Image Dehazing

IF 7.5 | CAS Zone 1 (Earth Sciences) | JCR Q1 (Engineering, Electrical & Electronic) | IEEE Transactions on Geoscience and Remote Sensing | Pub Date: 2024-09-11 | DOI: 10.1109/TGRS.2024.3457868

Xiaozhe Zhang; Fengying Xie; Haidong Ding; Shaocheng Yan; Zhenwei Shi


Citations: 0

Abstract



Proxy and Cross-Stripes Integration Transformer for Remote Sensing Image Dehazing

To avoid computational complexity that is quadratic in the feature map size, existing Transformer-based dehazing methods for remote sensing (RS) images either perform self-attention within local windows or capture long-range dependencies along the channel dimension rather than the spatial dimension. Each approach has its drawbacks. To address these limitations, we propose the Proxy and Cross-Stripes Integration Transformer (PCSformer) for RS image dehazing. PCSformer introduces two Transformer blocks: a sliding cross-stripes Transformer block and a local proxy-based global Transformer block. The former directly models long-range dependencies and captures rich contextual information for large-scale objects in RS images. The latter searches the whole feature map for information that is valuable to thick haze regions, generating more consistent and realistic scene details for those regions. Both achieve a large receptive field at cost-effective computational complexity within a single Transformer block. Furthermore, we introduce a shallow deep model with a small receptive field to conduct local refinement, which can mitigate artifacts associated with a large receptive field. Finally, to facilitate the application of dehazing models to downstream visual tasks, we contribute two large-scale datasets for RS image dehazing. Experiments indicate that dehazing models trained on our datasets assist downstream visual tasks under hazy atmospheric conditions better than dehazing models trained on existing datasets. Quantitative and qualitative experiments demonstrate that the proposed PCSformer significantly outperforms existing state-of-the-art techniques on dehazing benchmarks, particularly in the restoration of thick haze scenes. The code and datasets are available at https://github.com/SmileShaun/PCSformer.
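The complexity trade-off described in the abstract can be illustrated with a rough count of query-key pairs under three attention layouts. This is only a sketch under stated assumptions, not the PCSformer implementation: the function names, the non-overlapping window layout, and the stripe layout (each position attends within one horizontal and one vertical stripe, in the spirit of generic cross-shaped attention) are all hypothetical and chosen for illustration.

```python
def global_pairs(h: int, w: int) -> int:
    """Query-key pairs when every position attends to every position."""
    n = h * w
    return n * n  # quadratic in the feature map size


def window_pairs(h: int, w: int, win: int) -> int:
    """Pairs when attention is limited to non-overlapping win x win windows.

    Assumed layout (hypothetical): h and w are multiples of win.
    """
    assert h % win == 0 and w % win == 0
    num_windows = (h // win) * (w // win)
    tokens_per_window = win * win
    return num_windows * tokens_per_window ** 2


def stripe_pairs(h: int, w: int, stripe: int) -> int:
    """Pairs when attention runs inside horizontal and vertical stripes.

    Assumed layout (hypothetical): horizontal stripes are stripe x w,
    vertical stripes are h x stripe, and both passes are counted.
    """
    assert h % stripe == 0 and w % stripe == 0
    horizontal = (h // stripe) * (stripe * w) ** 2
    vertical = (w // stripe) * (h * stripe) ** 2
    return horizontal + vertical


if __name__ == "__main__":
    h = w = 64
    print("global :", global_pairs(h, w))
    print("window :", window_pairs(h, w, win=8))
    print("stripes:", stripe_pairs(h, w, stripe=2))
```

For a 64 x 64 feature map, this count gives 16,777,216 pairs for global attention, 262,144 for 8 x 8 windows, and 1,048,576 for stripes of width 2. Under these assumptions, stripes span the full height or width of the map at a fraction of the global cost, whereas windows are cheaper still but never see beyond their own 8 x 8 neighborhood.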


Source journal

IEEE Transactions on Geoscience and Remote Sensing (Engineering & Technology: Geochemistry & Geophysics)

CiteScore: 11.50

Self-citation rate: 28.00%

Articles published: 1912

Review time: 4.0 months

About the journal: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.