DCFusion: Difference correlation-driven fusion mechanism of infrared and visible images
Journal: Pattern Recognition (Impact Factor 7.5; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2024-09-12
Publication type: Journal Article
DOI: 10.1016/j.patcog.2024.111002
URL: https://www.sciencedirect.com/science/article/pii/S0031320324007532
Citations: 0
Abstract
In end-to-end image fusion models, the loss function significantly impacts performance. However, most loss functions treat salient and background regions in the source images equally and so fail to distinguish the complementary areas of multimodal images. This limits the model's ability to integrate information from these regions effectively. We therefore propose a difference correlation-driven fusion mechanism for infrared and visible images, called DCFusion. Specifically, the model uses a dual-branch interactive network that dynamically fuses cross-modal, multi-scale complementary information through element-wise multiplication, effectively integrating region-specific information. We introduce a two-stage method for generating salient target masks that adaptively focus on high-contrast regions in infrared images by analyzing pixel contrasts in local areas. Furthermore, we use the salient target masks to create heterogeneous images and design the $L_{SCD}$ loss function to minimize the information gap between the heterogeneous images and the fused image, thereby enhancing the model's interpretability. Experiments on the RoadScene and TNO datasets show that DCFusion surpasses existing representative fusion approaches, achieving state-of-the-art performance in both subjective visual and objective evaluations. Our code will be publicly available at https://github.com/MinLila/DCFusion.
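The abstract does not spell out how the salient target masks or the heterogeneous images are computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single-stage mask that marks infrared pixels brighter than their local neighbourhood mean, and a composite that takes infrared pixels inside the mask and visible pixels elsewhere. The function names, the window `radius`, the `threshold`, and the compositing rule are all hypothetical choices made for illustration.

```python
def local_mean(img, y, x, radius):
    """Mean of the (2*radius+1)^2 window around (y, x), clipped at the borders."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]
    return sum(vals) / len(vals)


def salient_mask(ir, radius=1, threshold=0.2):
    """Binary mask: 1 where an infrared pixel exceeds its local mean by `threshold`.

    Assumption: a simple local-contrast test stands in for the paper's
    two-stage mask generation, which the abstract does not describe.
    """
    return [[1 if ir[y][x] - local_mean(ir, y, x, radius) > threshold else 0
             for x in range(len(ir[0]))]
            for y in range(len(ir))]


def masked_composite(ir, vis, mask):
    """Take infrared pixels inside the mask and visible pixels elsewhere.

    Assumption: this is one plausible way to build a mask-guided reference
    image; the paper's heterogeneous images may be defined differently.
    """
    return [[m * a + (1 - m) * b for a, b, m in zip(ir_row, vis_row, mask_row)]
            for ir_row, vis_row, mask_row in zip(ir, vis, mask)]


# Toy 3x3 images with intensities in [0, 1]: one hot infrared pixel in the centre.
ir = [[0.1, 0.1, 0.1],
      [0.1, 0.9, 0.1],
      [0.1, 0.1, 0.1]]
vis = [[0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5],
       [0.5, 0.5, 0.5]]

mask = salient_mask(ir)
composite = masked_composite(ir, vis, mask)
```

On this toy input only the centre pixel is marked salient, so the composite keeps the infrared value there and the visible values everywhere else. A loss such as $L_{SCD}$ would then compare a fused image against reference images of this kind, but its exact form is not given in the abstract.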
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.