Hyperspectral and multispectral images fusion based on pyramid swin transformer

Infrared Physics & Technology (IF 3.1; Zone 3, Physics and Astrophysics; JCR Q2, Instruments & Instrumentation). Pub Date: 2024-11-07. DOI: 10.1016/j.infrared.2024.105617
Han Lang, Wenxing Bao, Wei Feng, Kewen Qu, Xuan Ma, Xiaowu Zhang
Infrared Physics & Technology, volume 143, Article 105617. Journal article, not open access. https://www.sciencedirect.com/science/article/pii/S1350449524005012
Citations: 0

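Abstract

Remote sensing image fusion aims to generate a high spatial resolution hyperspectral image (HR-HSI) by integrating a low spatial resolution hyperspectral image (LR-HSI) and a high spatial resolution multispectral image (HR-MSI). Convolutional Neural Networks (CNNs) have been widely used for the HSI-MSI fusion problem, but their limited receptive field makes it hard to capture global relationships within the feature maps. Transformers, on the other hand, are held back by their computational complexity, especially on high-dimensional data such as hyperspectral images (HSIs). To overcome this challenge, we propose an HSI-MSI fusion method based on the Pyramid Swin Transformer (PSTF). The pyramid design of the PSTF extracts multi-scale information from images. The Spatial–Spectral Crossed Attention (SSCA) module comprises the Cross Spatial Attention (CSA) and the Spectral Feature Integration (SFI) modules. The CSA module employs a cross-shaped self-attention mechanism, which provides greater modeling flexibility for different spatial scales and non-local structures than traditional convolutional layers. The SFI module introduces a global memory block (MB) to select the most relevant low-rank spectral vectors, integrating global spectral information with local spatial–spectral correlation to better extract and preserve spectral information. Additionally, the Separate Feature Extraction (SFE) module enhances the network's ability to represent image features by independently processing the positive and negative parts of shallow features, thus capturing details and structures more effectively and preventing the vanishing gradient problem. Experimental comparisons with state-of-the-art (SOTA) methods demonstrate the effectiveness of the PSTF method.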
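The abstract does not spell out how the SFE module processes the positive and negative parts of shallow features, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes the split is the usual decomposition x = max(x, 0) - max(-x, 0) and that each part then passes through its own branch. The function names `split_pos_neg` and `separate_branches`, the per-channel linear maps `w_pos` and `w_neg`, and all tensor shapes are hypothetical and chosen for illustration.

```python
import numpy as np


def split_pos_neg(features):
    """Split a feature map into non-negative positive and negative parts."""
    pos = np.maximum(features, 0.0)   # values > 0 kept, zeros elsewhere
    neg = np.maximum(-features, 0.0)  # magnitudes of values < 0, zeros elsewhere
    return pos, neg


def separate_branches(features, w_pos, w_neg):
    """Hypothetical two-branch step (an assumption, not the paper's SFE).

    features: array of shape (channels, height, width)
    w_pos, w_neg: arrays of shape (out_channels, channels), one per branch
    Returns the two branch outputs concatenated along the channel axis.
    """
    pos, neg = split_pos_neg(features)
    out_pos = np.einsum("oc,chw->ohw", w_pos, pos)  # branch for positive part
    out_neg = np.einsum("oc,chw->ohw", w_neg, neg)  # branch for negative part
    return np.concatenate([out_pos, out_neg], axis=0)


# Toy shallow feature map: 4 channels on an 8 x 8 grid (illustrative sizes).
rng = np.random.default_rng(0)
shallow = rng.standard_normal((4, 8, 8))
pos, neg = split_pos_neg(shallow)
w_pos = rng.standard_normal((6, 4))
w_neg = rng.standard_normal((6, 4))
out = separate_branches(shallow, w_pos, w_neg)
```

In this sketch the two parts never overlap and `pos - neg` recovers the input exactly, so negative activations reach the next layer as non-zero values instead of being zeroed out as they would be by a single ReLU. That is one plausible reading of why separate processing could help gradient flow; the paper's actual mechanism may differ.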
Source journal: Infrared Physics & Technology
CiteScore: 5.70
Self-citation rate: 12.10%
Articles published: 400
Review time: 67 days
Journal description: The journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region. Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine. Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.