Hyperspectral and multispectral images fusion based on pyramid swin transformer

IF 3.1 | Region 3, Physics and Astrophysics | Q2 INSTRUMENTS & INSTRUMENTATION | Infrared Physics & Technology | Pub Date: 2024-11-07 | DOI: 10.1016/j.infrared.2024.105617

Han Lang, Wenxing Bao, Wei Feng, Kewen Qu, Xuan Ma, Xiaowu Zhang

Infrared Physics & Technology, volume 143, Article 105617. Journal Article. https://www.sciencedirect.com/science/article/pii/S1350449524005012

Citation count: 0

Abstract

Remote sensing image fusion aims to generate a high spatial resolution hyperspectral image (HR-HSI) by integrating a low spatial resolution hyperspectral image (LR-HSI) and a high spatial resolution multispectral image (HR-MSI). While Convolutional Neural Networks (CNNs) have been widely employed to address the HSI-MSI fusion problem, their limited receptive field makes it difficult to capture global relationships within the feature maps. Transformers, on the other hand, are hindered by their computational complexity, especially when dealing with high-dimensional data such as hyperspectral images (HSIs). To overcome this challenge, we propose an HSI-MSI fusion method based on the Pyramid Swin Transformer (PSTF). The pyramid design of the PSTF effectively extracts multi-scale information from images. The Spatial–Spectral Crossed Attention (SSCA) module comprises the Cross Spatial Attention (CSA) and Spectral Feature Integration (SFI) modules. The CSA module employs a cross-shaped self-attention mechanism, which offers greater modeling flexibility across spatial scales and non-local structures than traditional convolutional layers. Meanwhile, the SFI module introduces a global memory block (MB) to select the most relevant low-rank spectral vectors, integrating global spectral information with local spatial–spectral correlation to better extract and preserve spectral information. Additionally, the Separate Feature Extraction (SFE) module enhances the network's ability to represent image features by independently processing the positive and negative parts of shallow features, thus capturing details and structures more effectively and preventing the vanishing gradient problem. Experimental comparisons with state-of-the-art (SOTA) methods demonstrate the effectiveness of the PSTF method.
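
The idea of processing the positive and negative parts of shallow features independently can be illustrated with a small NumPy sketch. This is not the authors' SFE module, whose exact layers the abstract does not specify; it only assumes that a feature map is split into its non-negative part and the magnitude of its negative part, that each part passes through its own branch, and that the results are concatenated. The function names, the toy shapes, and the per-channel linear maps `w_pos` and `w_neg` are hypothetical.

```python
import numpy as np

def split_pos_neg(features):
    """Split a feature array into its positive part and the magnitude of its negative part."""
    pos = np.maximum(features, 0.0)   # positive responses, zeros elsewhere
    neg = np.maximum(-features, 0.0)  # magnitudes of negative responses, zeros elsewhere
    return pos, neg

def separate_branches(features, w_pos, w_neg):
    """Apply a separate (hypothetical) channel-mixing map to each part, then concatenate."""
    pos, neg = split_pos_neg(features)
    # Matrix multiplication acts on the last (channel) axis of the H x W x C array.
    return np.concatenate([pos @ w_pos, neg @ w_neg], axis=-1)

rng = np.random.default_rng(0)
shallow = rng.standard_normal((4, 4, 8))  # toy H x W x C shallow feature map
w_pos = rng.standard_normal((8, 8))       # stand-in weights for the positive branch
w_neg = rng.standard_normal((8, 8))       # stand-in weights for the negative branch

pos, neg = split_pos_neg(shallow)
out = separate_branches(shallow, w_pos, w_neg)
```

Because `pos - neg` reproduces the input exactly, the split discards nothing, whereas a single ReLU would zero out negative responses together with their gradients. That is one plausible reading of how separate processing could help with vanishing gradients, offered here as an interpretation and not as a statement of the paper's design.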




Source journal

Infrared Physics & Technology (Physics: Optics)

CiteScore: 5.70

Self-citation rate: 12.10%

Articles published: 400

Review time: 67 days

Journal description: The journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region. Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine. Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.