DSMT: Dual-Stage Multiscale Transformer for Hyperspectral Snapshot Compressive Imaging

Fulin Luo;Xi Chen;Tan Guo;Xiuwen Gong;Lefei Zhang;Ce Zhu
Journal: IEEE Transactions on Image Processing (a publication of the IEEE Signal Processing Society), vol. 34, pp. 2473-2486
Published: 2025-04-08
DOI: 10.1109/TIP.2025.3556520
URL: https://ieeexplore.ieee.org/document/10955125/
Journal impact factor: 13.7

Abstract

Snapshot compressive imaging (SCI) compresses a 3D hyperspectral image (HSI) into a 2D measurement, significantly improving imaging efficiency while preserving the spatial and spectral information inherent in the HSI. However, reconstructing high-quality HSIs from compressed measurements remains a core challenge due to the complexity of the inverse problem. Transformer-based methods have recently shown promising performance in HSI reconstruction. Nonetheless, capturing local information, long-range dependencies, and multi-scale features at a reasonable computational cost remains a significant challenge. In this paper, we propose a dual-stage multiscale Transformer (DSMT) tailored for HSI reconstruction, which adopts a coarse-to-fine framework to enhance reconstruction accuracy and network generalization. Specifically, we design a novel U-Net architecture with a dual-branch encoder, in which two separate branches process distinct features that are then fused to achieve more refined reconstruction results. Full-scale skip connections are introduced to strengthen feature fusion across different stages. To further improve performance, we develop a novel self-attention mechanism called dual-window multiscale multi-head self-attention (DWM-MSA). By using two differently sized windows, DWM-MSA captures long-range dependencies and local information at multiple scales, significantly boosting reconstruction quality. Additionally, we introduce a hybrid positional embedding method, conditional/relative positional embedding (CRPE), which dynamically models both spatial and spectral dependencies, enhancing the Transformer’s capacity for HSI reconstruction. Extensive quantitative and qualitative experiments on both simulated and real data demonstrate the superior performance, stability, and generalization ability of DSMT. The code is available at https://github.com/chenx2000/DSMT.
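To make the compression of a 3D HSI into a 2D measurement concrete, the following is a minimal sketch of a sensing step. The abstract does not specify the optical model, so this assumes a CASSI-style setup in which each spectral band is modulated by a shared binary mask, shifted horizontally in proportion to its band index, and summed on the detector. The function name `sci_forward` and the parameter `step` are hypothetical and are not taken from the DSMT code.

```python
def sci_forward(cube, mask, step=1):
    """Collapse a hyperspectral cube into one 2D measurement.

    Assumed model (not from the paper): every band is multiplied by the
    same coded mask, shifted right by band_index * step pixels, and the
    shifted bands are summed.

    cube: list of B bands, each an H x W list of lists
    mask: H x W list of lists with 0/1 entries
    Returns an H x (W + step * (B - 1)) list of lists.
    """
    bands = len(cube)
    height = len(mask)
    width = len(mask[0])
    out_width = width + step * (bands - 1)
    measurement = [[0.0] * out_width for _ in range(height)]
    for b, band in enumerate(cube):
        shift = b * step  # dispersion shift assumed linear in band index
        for r in range(height):
            for c in range(width):
                measurement[r][c + shift] += mask[r][c] * band[r][c]
    return measurement


# Toy example: 3 bands of 2 x 2 pixels, all ones, with a fully open mask.
toy_cube = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(3)]
toy_mask = [[1, 1], [1, 1]]
toy_measurement = sci_forward(toy_cube, toy_mask, step=1)
```

Under these assumptions, three 2 x 2 bands land on a single 2 x 4 detector image, and overlapping columns mix contributions from neighboring bands. Recovering the original cube from such a mixed measurement is the inverse problem that a reconstruction network addresses.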