DMCM: Dwo-branch multilevel feature fusion with cross-attention mechanism for infrared and visible image fusion.

IF 2.6, Zone 3 (multidisciplinary journal), JCR Q1 (MULTIDISCIPLINARY SCIENCES), PLoS ONE, Pub Date: 2025-03-28, eCollection Date: 2025-01-01, DOI: 10.1371/journal.pone.0318931

Xicheng Sun, Fu Lv, Yongan Feng, Xu Zhang

PLoS ONE, volume 20, issue 3, e0318931. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11952241/pdf/

Citations: 0

Abstract

Current infrared and visible image fusion algorithms suffer from insufficient feature extraction, loss of detailed texture information, underutilization of differential and shared information, and a large number of model parameters. To address these limitations, this paper proposes a novel multi-scale infrared and visible image fusion method with two-branch feature interaction. The method introduces a lightweight multi-scale group convolution, based on GS convolution, that combines group convolution with stacks of small convolutional kernels to enhance multi-scale information interaction while reducing network parameters. The multi-level attention module is improved by integrating edge-enhanced branches and depthwise separable convolutions to preserve detailed texture information. A lightweight cross-attention fusion module is introduced to optimize the use of differential and shared features while minimizing computational complexity. Finally, the efficiency of local attention is enhanced by adding a multi-dimensional fusion branch, which strengthens the interaction of information across multiple dimensions and facilitates comprehensive spatial information extraction from multimodal images. The proposed algorithm and seven others were tested extensively on public datasets such as TNO and Roadscene. The experimental results show that the proposed method outperforms the other algorithms in both subjective and objective evaluations, and that it also performs well in terms of operational efficiency. Moreover, target detection experiments conducted on the [Formula: see text] dataset confirm the superior performance of the proposed algorithm.
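The parameter savings that the abstract attributes to group convolution, stacked small kernels, and depthwise separable convolution can be illustrated with simple weight counting. The sketch below is not the authors' module: the channel width of 64 and the group count of 4 are illustrative assumptions, and the helper names are hypothetical. It only counts the weights of generic convolution layers, without bias terms.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a 2-D convolution with a k x k kernel (no bias)."""
    assert c_in % groups == 0 and c_out % groups == 0
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a pointwise 1 x 1 convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)
    pointwise = conv_params(c_in, c_out, 1)
    return depthwise + pointwise


c = 64  # assumed channel width, for illustration only

standard_5x5 = conv_params(c, c, 5)               # one large kernel
stacked_3x3 = 2 * conv_params(c, c, 3)            # two 3x3 layers, same 5x5 receptive field
grouped_3x3 = 2 * conv_params(c, c, 3, groups=4)  # same stack with 4 groups (assumed)
dw_sep_3x3 = depthwise_separable_params(c, c, 3)  # depthwise separable 3x3

for name, n in [
    ("standard 5x5", standard_5x5),
    ("stacked 3x3 x2", stacked_3x3),
    ("grouped 3x3 x2 (g=4)", grouped_3x3),
    ("depthwise separable 3x3", dw_sep_3x3),
]:
    print(f"{name:>24}: {n} weights")
```

Under these assumptions, splitting the channels into groups divides the weight count by the number of groups, and replacing a dense kernel with a depthwise plus pointwise pair removes most of the remaining weights. The actual savings in the paper depend on its specific architecture, which the abstract does not detail.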


Source journal

PLoS ONE (category: Biology)

CiteScore: 6.20

Self-citation rate: 5.40%

Articles published: 14242

Review time: 3.7 months

About the journal: PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open access: freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage