CS2Fusion: Contrastive learning for Self-Supervised infrared and visible image fusion by estimating feature compensation map

IF 14.7 · CAS Tier 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence) · Information Fusion · Pub Date: 2023-09-24 · DOI: 10.1016/j.inffus.2023.102039
Xue Wang, Zheng Guan, Wenhua Qian, Jinde Cao, Shu Liang, Jin Yan
{"title":"CS2Fusion: Contrastive learning for Self-Supervised infrared and visible image fusion by estimating feature compensation map","authors":"Xue Wang ,&nbsp;Zheng Guan ,&nbsp;Wenhua Qian ,&nbsp;Jinde Cao ,&nbsp;Shu Liang ,&nbsp;Jin Yan","doi":"10.1016/j.inffus.2023.102039","DOIUrl":null,"url":null,"abstract":"<div><p>In infrared and visible image fusion (IVIF), prior knowledge constraints established with image-level information often ignore the identity and differences between source image features and cannot fully utilize the complementary information role of infrared images to visible images. For this purpose, this study develops a <strong>C</strong>ontrastive learning-based <strong>S</strong>elf-<strong>S</strong>upervised fusion model (CS<span><math><msup><mrow></mrow><mrow><mn>2</mn></mrow></msup></math></span>Fusion), which considers infrared images as a complement to visible images, and develops a Compensation Perception Network (CPN) to guide the backbone network to generate fusion images by estimating the feature compensation map of infrared images. The core idea behind this method is based on the following observations: (1) there is usually a significant disparity in semantic information between different modalities; (2) despite the large semantic differences, the distribution of self-correlation and saliency features tends to be similar among the same modality features. Building upon these observations, we use self-correlation and saliency operation (SSO) to construct positive and negative pairs, driving CPN to perceive the complementary features of infrared images relative to visible images under the constraint of contrastive loss. CPN also incorporates a self-supervised learning mechanism, where visually impaired areas are simulated by randomly cropping patches from visible images to provide more varied information of the same scene to form multiple positive samples to enhance the model’s fine-grained perception capability. In addition, we also designed a demand-driven module (DDM) in the backbone network, which actively queries to improve the information between layers in the image reconstruction, and then integrates more spatial structural information. Notably, the CPN as an auxiliary network is only used in training to drive the backbone network to complete the IVIF in a self-supervised form. Experiments on various benchmark datasets and high-level vision tasks demonstrate the superiority of our CS<span><math><msup><mrow></mrow><mrow><mn>2</mn></mrow></msup></math></span>Fusion over the state-of-the-art IVIF method.</p></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":null,"pages":null},"PeriodicalIF":14.7000,"publicationDate":"2023-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S156625352300355X","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

In infrared and visible image fusion (IVIF), prior-knowledge constraints built from image-level information often ignore the identity of and differences between source-image features, and fail to fully exploit the complementary information that infrared images provide to visible images. To address this, this study develops a Contrastive learning-based Self-Supervised fusion model (CS2Fusion), which treats infrared images as a complement to visible images and introduces a Compensation Perception Network (CPN) that guides the backbone network to generate fused images by estimating the feature compensation map of the infrared image. The core idea rests on two observations: (1) there is usually a significant disparity in semantic information between different modalities; (2) despite these large semantic differences, the distributions of self-correlation and saliency features tend to be similar within the same modality. Building on these observations, we use a self-correlation and saliency operation (SSO) to construct positive and negative pairs, driving the CPN to perceive the features of infrared images that complement visible images under the constraint of a contrastive loss. The CPN also incorporates a self-supervised learning mechanism: visually impaired areas are simulated by randomly cropping patches from visible images, providing more varied views of the same scene that form multiple positive samples and enhance the model's fine-grained perception capability. In addition, we design a demand-driven module (DDM) in the backbone network that actively queries inter-layer information during image reconstruction, integrating more spatial structural information. Notably, the CPN is an auxiliary network used only during training to drive the backbone network to complete IVIF in a self-supervised form. Experiments on various benchmark datasets and high-level vision tasks demonstrate the superiority of CS2Fusion over state-of-the-art IVIF methods.
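To make the pairing scheme concrete, here is a minimal PyTorch sketch of how same-modality descriptors could act as positives and cross-modality descriptors as negatives under an InfoNCE-style contrastive loss, following the abstract's two observations. Everything below is an illustrative assumption rather than the authors' implementation: the Gram-matrix self-correlation, the channel-mean saliency, the loss form, and all function names (self_correlation, saliency, sso_descriptor, contrastive_loss, random_mask) are stand-ins for the paper's SSO and contrastive loss, whose exact definitions the abstract does not give.

```python
import torch
import torch.nn.functional as F

def self_correlation(feat):
    # Gram-style channel self-correlation of a feature map:
    # (B, C, H, W) -> flattened, L2-normalized (B, C*C) descriptor.
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    gram = torch.bmm(f, f.transpose(1, 2)) / (h * w)
    return F.normalize(gram.reshape(b, -1), dim=1)

def saliency(feat):
    # Channel-mean activation magnitude as a crude saliency map, flattened
    # and L2-normalized; a stand-in for SSO's saliency operator.
    s = feat.abs().mean(dim=1).reshape(feat.shape[0], -1)
    return F.normalize(s, dim=1)

def sso_descriptor(feat):
    # Concatenate both statistics into one descriptor per image.
    return torch.cat([self_correlation(feat), saliency(feat)], dim=1)

def contrastive_loss(anchor, positives, negatives, tau=0.1):
    # InfoNCE-style objective: pull the anchor descriptor toward every
    # positive descriptor and push it away from every negative one.
    pos = torch.stack([F.cosine_similarity(anchor, p) for p in positives], dim=1)
    neg = torch.stack([F.cosine_similarity(anchor, n) for n in negatives], dim=1)
    logits = torch.cat([pos, neg], dim=1) / tau
    log_prob = F.log_softmax(logits, dim=1)
    return -log_prob[:, :pos.shape[1]].mean()  # average over positive terms

def random_mask(img, patch=32, n_patches=4):
    # Simulate "visually impaired" regions by zeroing random patches of a
    # visible image, yielding extra positive views of the same scene.
    out = img.clone()
    b, _, h, w = img.shape
    for i in range(b):
        for _ in range(n_patches):
            y = torch.randint(0, h - patch + 1, (1,)).item()
            x = torch.randint(0, w - patch + 1, (1,)).item()
            out[i, :, y:y + patch, x:x + patch] = 0
    return out

# Toy usage with random tensors standing in for encoder features:
vis = torch.randn(2, 64, 32, 32)      # visible-modality features (anchor)
vis_aug = torch.randn(2, 64, 32, 32)  # features of a masked visible view (positive)
ir = torch.randn(2, 64, 32, 32)       # infrared features (negative vs. visible)
loss = contrastive_loss(sso_descriptor(vis),
                        [sso_descriptor(vis_aug)],
                        [sso_descriptor(ir)])
masked = random_mask(torch.randn(2, 3, 128, 128))  # augmented visible image
```

The random_mask helper mirrors the abstract's patch-cropping mechanism: each masked visible view would be encoded and its SSO descriptor appended to the positives list, giving the model multiple degraded views of the same scene to sharpen its fine-grained perception.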

Source journal: Information Fusion (Engineering & Technology – Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles per year: 161
Review time: 7.9 months
Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.
Latest articles in this journal:
Large model-driven hyperscale healthcare data fusion analysis in complex multi-sensors
Eco-friendly integration of shared autonomous mobility on demand and public transit based on multi-source data
Information fusion for large-scale multi-source data based on the Dempster-Shafer evidence theory
DSAP: Analyzing bias through demographic comparison of datasets
Generative technology for human emotion recognition: A scoping review