Cross-Modal Contrastive Pansharpening via Uncertainty Guidance

IF 8.6 · JCR Q1 (Engineering, Electrical & Electronic) · CAS Tier 1 (Earth Science)
IEEE Transactions on Geoscience and Remote Sensing · Pub Date: 2025-03-28 · DOI: 10.1109/TGRS.2025.3555610
Haoying Zeng;Xiaoyuan Yang;Kangqing Shen;Yixiao Li;Jin Jiang;Fangyi Li
{"title":"Cross-Modal Contrastive Pansharpening via Uncertainty Guidance","authors":"Haoying Zeng;Xiaoyuan Yang;Kangqing Shen;Yixiao Li;Jin Jiang;Fangyi Li","doi":"10.1109/TGRS.2025.3555610","DOIUrl":null,"url":null,"abstract":"Deep learning (DL)-based pansharpening has been widely applied in high-resolution imaging. Yet, artifacts related to generalization and oversmoothing have continuously been the challenge, primarily due to the mismatch between the simulation dataset and the unseen real-world scenarios. Current approaches address these through unsupervised frameworks or generative models, while modal inconsistency is not fully considered, leading to suboptimal performance. In this article, we propose a contrastive cross-modal framework via uncertainty guidance (UGCC), which comprises three key modules: a contrast feature enhancement module (CFEM), a cross-modal compensation module (CMCM), and an uncertainty guidance module (UGM). First, to enhance generalization and reduce overfitting, CFEM is introduced. Robust contrast features are augmented and learned sparsely in latent space, where sample distributions are refined, and redundant information is filtered from highly similar sample pairs for enhanced training stability. Furthermore, CMCM mitigates modal inconsistency effectively by domain transfer and collaborative attention, achieving efficient modal separation and interaction. Finally, to adaptively balance the performance of CMCM and CFEM based on prediction confidence, a hybrid loss function is designed, where UGM adjusts the weights through quantifying statistical-versus-structural uncertainties. Extensive experiments on Quickbird, Gaofen-2, WorldView-2, and WorldView-3 demonstrate that the performance of the proposed method surpasses or matches the state of the arts. Furthermore, ablation studies validate the effectiveness of each component. The code is now available at: <uri>https://github.com/meimeizeng/UGCF</uri>.","PeriodicalId":13213,"journal":{"name":"IEEE Transactions on Geoscience and Remote Sensing","volume":"63 ","pages":"1-14"},"PeriodicalIF":8.6000,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Geoscience and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10945437/","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Deep learning (DL)-based pansharpening has been widely applied in high-resolution imaging. Yet, generalization artifacts and oversmoothing remain persistent challenges, primarily due to the mismatch between simulated training datasets and unseen real-world scenarios. Current approaches address these issues through unsupervised frameworks or generative models, but modal inconsistency is not fully considered, leading to suboptimal performance. In this article, we propose a contrastive cross-modal framework via uncertainty guidance (UGCC), which comprises three key modules: a contrast feature enhancement module (CFEM), a cross-modal compensation module (CMCM), and an uncertainty guidance module (UGM). First, to enhance generalization and reduce overfitting, CFEM is introduced: robust contrast features are augmented and learned sparsely in latent space, where sample distributions are refined and redundant information is filtered from highly similar sample pairs to improve training stability. Furthermore, CMCM effectively mitigates modal inconsistency through domain transfer and collaborative attention, achieving efficient modal separation and interaction. Finally, to adaptively balance CMCM and CFEM based on prediction confidence, a hybrid loss function is designed in which UGM adjusts the weights by quantifying statistical versus structural uncertainties. Extensive experiments on QuickBird, Gaofen-2, WorldView-2, and WorldView-3 demonstrate that the proposed method surpasses or matches the state of the art. Furthermore, ablation studies validate the effectiveness of each component. The code is available at: https://github.com/meimeizeng/UGCF.
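The abstract does not spell out CFEM's training objective, but contrastive feature learning over positive and negative sample pairs is commonly implemented with an InfoNCE-style loss. Below is a minimal PyTorch sketch of that general pattern, assuming (B, D) latent features from two augmented views of each sample; the function name and temperature value are illustrative, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_feature_loss(anchor: torch.Tensor,
                             positive: torch.Tensor,
                             temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style contrastive loss (illustrative sketch, not from the paper).

    anchor, positive: (B, D) latent features from two augmented views of
    the same B samples; the other B-1 rows of the batch act as negatives.
    """
    anchor = F.normalize(anchor, dim=1)        # unit-norm features
    positive = F.normalize(positive, dim=1)
    logits = anchor @ positive.t() / temperature               # (B, B) cosine similarities
    labels = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, labels)     # diagonal entries are the matching pairs
```

Filtering redundant information from highly similar sample pairs, as CFEM reportedly does, would then amount to masking near-duplicate negatives in the logits matrix before the cross-entropy.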
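Similarly, the "collaborative attention" in CMCM suggests bidirectional cross-attention between the panchromatic (PAN) and multispectral (MS) feature streams. The following is a generic sketch of that kind of interaction using standard multi-head attention over flattened spatial tokens; the class name and the residual fusion are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Bidirectional cross-attention between PAN and MS tokens.

    A generic stand-in for "collaborative attention"; the paper's
    actual module may be structured differently.
    """
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.pan_to_ms = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ms_to_pan = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, pan_tokens: torch.Tensor, ms_tokens: torch.Tensor):
        # pan_tokens, ms_tokens: (B, N, dim) flattened spatial tokens
        ms_out, _ = self.pan_to_ms(ms_tokens, pan_tokens, pan_tokens)   # MS queries PAN
        pan_out, _ = self.ms_to_pan(pan_tokens, ms_tokens, ms_tokens)   # PAN queries MS
        return pan_out + pan_tokens, ms_out + ms_tokens                 # residual fusion
```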
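Finally, the uncertainty-guided weighting of the hybrid loss can be read as learned loss balancing. One common realization is homoscedastic uncertainty weighting in the style of Kendall et al. (2018); the sketch below shows that pattern with one learned log-variance per term. Treating UGM's statistical-versus-structural trade-off this way is an assumption made for illustration.

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Balances two loss terms with learned uncertainty weights
    (a plausible reading of UGM, not the paper's verified design)."""
    def __init__(self):
        super().__init__()
        # Scalar log-variances, learned jointly with the network.
        self.log_var_stat = nn.Parameter(torch.zeros(()))
        self.log_var_struct = nn.Parameter(torch.zeros(()))

    def forward(self, loss_statistical: torch.Tensor,
                loss_structural: torch.Tensor) -> torch.Tensor:
        w_stat = torch.exp(-self.log_var_stat)      # lower variance => larger weight
        w_struct = torch.exp(-self.log_var_struct)
        return (w_stat * loss_statistical + self.log_var_stat
                + w_struct * loss_structural + self.log_var_struct)
```

Lower learned variance amplifies a term's weight, so the model emphasizes whichever objective it is more confident about, while the additive log-variance terms prevent the trivial solution of declaring infinite uncertainty for both.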
Source Journal

IEEE Transactions on Geoscience and Remote Sensing (Engineering & Technology: Geochemistry & Geophysics)
CiteScore: 11.50
Self-citation rate: 28.00%
Articles per year: 1912
Review time: 4.0 months

About the journal: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space, and on the processing, interpretation, and dissemination of this information.
Latest articles in this journal

- Moving Infrared Small Target Detection via Motion-aided Integrated Spatial-Temporal Network
- Viscoacoustic Least-Squares Reverse Time Migration Using Structure-Oriented Total Variation Norm Regularization
- Evaluation and verification of different characterization methods of echo signal for the AEMS/ACDL
- CentralGraphFormer: Center-Guided Spectral-Spatial Graph Transformer for Hyperspectral Image Classification
- An Information Entropy Feature Preservation-based Band Selection and Noise Reduction Method for Hyperspectral Image Classification