DCFusion: Difference correlation-driven fusion mechanism of infrared and visible images

Pattern Recognition (IF 7.5; Zone 1, Computer Science; JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) Pub Date: 2024-09-12 DOI: 10.1016/j.patcog.2024.111002


Citations: 0

Abstract



In end-to-end image fusion models, the loss function significantly impacts performance. However, most loss functions treat salient and background regions in the source images equally and therefore fail to distinguish complementary areas in multimodal images. This limits the model's ability to integrate information from these regions effectively. We therefore propose a difference correlation-driven fusion mechanism for infrared and visible images, called DCFusion. Specifically, the model uses a dual-branch interactive network that dynamically fuses cross-modal, multi-scale complementary information through element-wise multiplication, effectively integrating region-specific information. We introduce a two-stage method for generating salient target masks that adaptively focus on high-contrast regions in infrared images by analyzing pixel contrast in local areas. Furthermore, we use the salient target masks to create heterogeneous images and design the $L_{SCD}$ loss function to minimize the information gap between the heterogeneous images and the fused image, thereby enhancing the model's interpretability. Experiments on the RoadScene and TNO datasets show that DCFusion surpasses existing representative fusion approaches, achieving state-of-the-art performance in both subjective visual and objective evaluations. Our code will be publicly available at https://github.com/MinLila/DCFusion.
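The abstract gives no formulas, so the following is only a rough NumPy sketch of the ideas it names, not the authors' implementation. It assumes that a salient mask can be approximated by thresholding each infrared pixel against its local mean, that a "heterogeneous image" pastes masked infrared pixels onto the visible image, and that $L_{SCD}$ is related to the sum of the correlations of differences (SCD) fusion metric. The function names, the window size, and the threshold are all hypothetical.

```python
import numpy as np


def local_contrast_mask(ir, window=5, threshold=0.2):
    """Hypothetical mask: 1 where a pixel exceeds its local mean by `threshold`."""
    pad = window // 2
    padded = np.pad(ir, pad, mode="edge")
    # View every window x window neighbourhood, then average it.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    local_mean = windows.mean(axis=(-2, -1))
    return (ir - local_mean > threshold).astype(ir.dtype)


def heterogeneous_image(ir, vis, mask):
    """Assumed construction: infrared inside the mask, visible elsewhere."""
    return mask * ir + (1.0 - mask) * vis


def correlation(a, b, eps=1e-8):
    """Pearson-style correlation between two images of equal shape."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps))


def scd(fused, src_a, src_b):
    """SCD score: corr(fused - src_b, src_a) + corr(fused - src_a, src_b)."""
    return correlation(fused - src_b, src_a) + correlation(fused - src_a, src_b)


# Toy data: a small bright infrared target on a dark background.
ir = np.zeros((8, 8))
ir[3:5, 3:5] = 1.0
vis = np.random.default_rng(0).random((8, 8))

mask = local_contrast_mask(ir)
het = heterogeneous_image(ir, vis, mask)
score = scd(het, ir, vis)
print(mask.sum(), score)
```

A training loss in this spirit would reward a higher SCD-style score between the fused output and its reference images. The exact form of $L_{SCD}$ is not stated in the abstract.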


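Source journal

Pattern Recognition (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 14.40

Self-citation rate: 16.20%

Articles published: 683

Review time: 5.6 months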

About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.
