The urgent need for advanced fire detection stems from the increasing intensity of fire incidents, which cause massive property loss and irreversible damage. To overcome the limitations of traditional fire detection methods such as smoke detectors, fire detection based on computer vision (CV) algorithms has been adopted to improve detection accuracy. Compared with single-modal fire detection, multi-modal fire detection has gained attention because it leverages the richer, complementary information present in RGB and thermal images. However, prevalent multi-modal fire detection methods significantly increase model complexity by requiring two separate backbone streams to process RGB and thermal images independently. To address this issue, this paper proposes a four-channel single-stream fire detection method based on YOLOv5, in which the RGB and thermal images are concatenated along the channel dimension to form a four-channel input. Comparison experiments against dual-stream YOLOv5 models using add fusion and transformer fusion demonstrate that the four-channel single-stream model reduces model complexity while improving detection accuracy. To further enhance detection accuracy and reduce model complexity, this study redesigns YOLOv5's C3 module by integrating the convolutional block attention module (CBAM) to form the C3CBAM module, and introduces the SCYLLA-Intersection over Union (SIoU) loss function. Comparison with state-of-the-art (SOTA) models in multi-modal object detection, such as the YOLOv5-based dual-stream model, shows that the proposed approach improves detection under the diverse conditions present in the selected dataset.
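The four-channel input construction described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the use of NumPy, and the assumption that the thermal image is single-channel and spatially registered with the RGB image are all assumptions made for this example.

```python
import numpy as np

def make_four_channel(rgb: np.ndarray, thermal: np.ndarray) -> np.ndarray:
    """Concatenate an RGB image (H, W, 3) with an aligned thermal image
    along the channel axis to produce a four-channel input (H, W, 4).

    Illustrative sketch only; assumes the two images are already
    spatially registered and share the same height and width.
    """
    if thermal.ndim == 2:                 # (H, W) -> (H, W, 1)
        thermal = thermal[..., np.newaxis]
    if rgb.shape[:2] != thermal.shape[:2]:
        raise ValueError("RGB and thermal images must be spatially aligned")
    return np.concatenate([rgb, thermal], axis=-1)

# Usage: a 640x480 RGB frame paired with a single-channel thermal frame.
rgb = np.zeros((480, 640, 3), dtype=np.uint8)
thermal = np.zeros((480, 640), dtype=np.uint8)
x = make_four_channel(rgb, thermal)
print(x.shape)  # (480, 640, 4)
```

A single-stream network then only needs its first convolutional layer widened to accept four input channels, instead of duplicating the entire backbone for each modality.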