Haze severely degrades visual quality and impairs the performance of outdoor computer vision systems. Traditional single-image dehazing methods suffer from inherent limitations in dense haze scenarios due to the ill-posed nature of the problem. Leveraging complementary information from visible (RGB) and near-infrared (NIR) modalities offers a robust solution, as NIR signals exhibit superior penetration through atmospheric particles. This paper presents MMDehazeNet, a novel end-to-end multimodal fusion network for visible-infrared image dehazing. Adopting a U-Net-based dual-encoder architecture, it jointly processes hazy RGB and NIR images, with three key innovations: (1) a Gated Cross-Modality Attention (GCMA) module for efficient multi-level fusion; (2) a Multimodal Feature Correction (MMFC) module with a learned gating mechanism for adaptive inter-modal alignment; and (3) Multi-Scale Convolutional Layers (MSCL) for multi-receptive field feature extraction. Three variants (MMDehazeNet-S, -B, and -L) are proposed. Extensive evaluations on the AirSim-VID, EPFL, and FANVID datasets demonstrate that MMDehazeNet achieves state-of-the-art performance. Quantitative and qualitative comparisons validate its significant superiority over existing single- and multi-modal methods, particularly under challenging medium and dense haze conditions.
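To make the gating idea behind the GCMA/MMFC modules concrete, the sketch below shows a minimal, hypothetical elementwise gated fusion of RGB and NIR feature responses. In the actual network the gate would be produced by learned layers; here the function `gated_fusion` and its fixed gate are illustrative assumptions only, not the paper's implementation.

```python
import math

def gated_fusion(rgb_feat, nir_feat):
    """Toy gated fusion of two flattened feature vectors.

    The gate lies in (0, 1) and grows where the NIR response
    exceeds the RGB response, so haze-degraded RGB positions
    lean more heavily on the NIR signal. (Hypothetical gate;
    the paper learns this mapping end to end.)
    """
    fused = []
    for r, n in zip(rgb_feat, nir_feat):
        gate = 1.0 / (1.0 + math.exp(-(n - r)))  # sigmoid gate in (0, 1)
        # Convex blend: output always lies between the two modality values.
        fused.append(gate * n + (1.0 - gate) * r)
    return fused

rgb = [0.2, 0.8, 0.5, 0.1]
nir = [0.9, 0.7, 0.4, 0.6]
out = gated_fusion(rgb, nir)
# Each fused value stays between the RGB and NIR responses.
assert all(min(r, n) <= f <= max(r, n) for r, n, f in zip(rgb, nir, out))
```

Because the gate is a convex weight, the fused response never overshoots either modality, which mirrors the role of adaptive inter-modal correction described for MMFC.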