MCAFNet:多光谱目标检测的多尺度跨模态自适应融合网络

IF 3.4 3区 工程技术 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Digital Signal Processing Pub Date : 2025-04-01 Epub Date: 2025-01-14 DOI:10.1016/j.dsp.2025.104996
Shangpo Zheng , Liu Junfeng , Jun Zeng
{"title":"MCAFNet:多光谱目标检测的多尺度跨模态自适应融合网络","authors":"Shangpo Zheng ,&nbsp;Liu Junfeng ,&nbsp;Jun Zeng","doi":"10.1016/j.dsp.2025.104996","DOIUrl":null,"url":null,"abstract":"<div><div>Multispectral object detection techniques integrate data from various spectral modalities, such as combining thermal images with RGB visible light images, to enhance the precision a-nd robustness of object detection under diverse environmental c-onditions. Although this approach has improved detection capab-ilities, significant challenges remain in fully leveraging the specif-ic detail information of each single modality and accurately capt-uring cross-modality shared features information. To address th-ese challenges, we propose a Multiscale Cross-modality Adaptive Fusion Network (MCAFNet). This network incorporates Cross- modality interactive Transformer (CMIT) module, Multimodal Adaptive Weighted Fusion (MAWF) module, and a 3D-Integrated Attention Feature Enhancement (3D-IAFE) module. These components work together to comprehensively extract complementary feature between modalities and specific detailed feature within each modality, thereby enhancing the accuracy and robustness of multimodal object detection. Extensive experimental validation and in-depth ablation studies confirm the effectiveness of the proposed method, achieving state-of-the-art detection performance on multiple public datasets.</div></div>","PeriodicalId":51011,"journal":{"name":"Digital Signal Processing","volume":"159 ","pages":"Article 104996"},"PeriodicalIF":3.4000,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MCAFNet: Multiscale cross-modality adaptive fusion network for multispectral object detection\",\"authors\":\"Shangpo Zheng ,&nbsp;Liu Junfeng ,&nbsp;Jun Zeng\",\"doi\":\"10.1016/j.dsp.2025.104996\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Multispectral object detection techniques integrate data from various spectral modalities, such as combining thermal images with RGB visible light images, to enhance the precision a-nd robustness of object detection under diverse environmental c-onditions. Although this approach has improved detection capab-ilities, significant challenges remain in fully leveraging the specif-ic detail information of each single modality and accurately capt-uring cross-modality shared features information. To address th-ese challenges, we propose a Multiscale Cross-modality Adaptive Fusion Network (MCAFNet). This network incorporates Cross- modality interactive Transformer (CMIT) module, Multimodal Adaptive Weighted Fusion (MAWF) module, and a 3D-Integrated Attention Feature Enhancement (3D-IAFE) module. These components work together to comprehensively extract complementary feature between modalities and specific detailed feature within each modality, thereby enhancing the accuracy and robustness of multimodal object detection. Extensive experimental validation and in-depth ablation studies confirm the effectiveness of the proposed method, achieving state-of-the-art detection performance on multiple public datasets.</div></div>\",\"PeriodicalId\":51011,\"journal\":{\"name\":\"Digital Signal Processing\",\"volume\":\"159 \",\"pages\":\"Article 104996\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2025-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Digital Signal Processing\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1051200425000181\",\"RegionNum\":3,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/14 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Digital Signal Processing","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1051200425000181","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/14 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

摘要

多光谱目标检测技术集成了多种光谱模式的数据,如将热图像与RGB可见光图像相结合,以提高不同环境条件下目标检测的精度和鲁棒性。尽管这种方法提高了检测能力,但在充分利用每个单一模态的特定细节信息和准确捕获跨模态共享特征信息方面仍然存在重大挑战。为了解决这些挑战,我们提出了一个多尺度跨模态自适应融合网络(MCAFNet)。该网络集成了跨模态交互变压器(CMIT)模块、多模态自适应加权融合(MAWF)模块和3d集成注意力特征增强(3D-IAFE)模块。这些组件共同工作,全面提取模态之间的互补特征和每个模态内的特定细节特征,从而提高多模态目标检测的准确性和鲁棒性。广泛的实验验证和深入的消融研究证实了所提出方法的有效性,在多个公共数据集上实现了最先进的检测性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
MCAFNet: Multiscale cross-modality adaptive fusion network for multispectral object detection
Multispectral object detection techniques integrate data from various spectral modalities, such as combining thermal images with RGB visible light images, to enhance the precision a-nd robustness of object detection under diverse environmental c-onditions. Although this approach has improved detection capab-ilities, significant challenges remain in fully leveraging the specif-ic detail information of each single modality and accurately capt-uring cross-modality shared features information. To address th-ese challenges, we propose a Multiscale Cross-modality Adaptive Fusion Network (MCAFNet). This network incorporates Cross- modality interactive Transformer (CMIT) module, Multimodal Adaptive Weighted Fusion (MAWF) module, and a 3D-Integrated Attention Feature Enhancement (3D-IAFE) module. These components work together to comprehensively extract complementary feature between modalities and specific detailed feature within each modality, thereby enhancing the accuracy and robustness of multimodal object detection. Extensive experimental validation and in-depth ablation studies confirm the effectiveness of the proposed method, achieving state-of-the-art detection performance on multiple public datasets.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Digital Signal Processing
Digital Signal Processing 工程技术-工程:电子与电气
CiteScore
5.30
自引率
17.20%
发文量
435
审稿时长
66 days
期刊介绍: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as: • big data• machine learning• internet of things• information security• systems biology and computational biology,• financial time series analysis,• autonomous vehicles,• quantum computing,• neuromorphic engineering,• human-computer interaction and intelligent user interfaces,• environmental signal processing,• geophysical signal processing including seismic signal processing,• chemioinformatics and bioinformatics,• audio, visual and performance arts,• disaster management and prevention,• renewable energy,
期刊最新文献
D3S-Net: A dense SAR ship detection network for dynamic denoising and structural sparsity deinterference Fine-grained adaptive distillation framework for cross-modal hashing retrieval A strategy of joint TR beam resource allocation and STSLF in netted radar system in the complex scenario SCP-based Kalman filter for multi-sensor networked systems with bandwidth allocation and uniform quantization A method for decoupling RFF from received cepstral-domain CIR in LTE-V2X communications
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1