Early smoke and flame detection based on transformer

Journal of Safety Science and Resilience (安全科学与韧性(英文)), IF 3.7, JCR Q1 (Public, Environmental & Occupational Health). Pub Date: 2023-09-01. DOI: 10.1016/j.jnlssr.2023.06.002

Xinzhi Wang, Mengyue Li, Mingke Gao, Quanyi Liu, Zhennan Li, Luyao Kou

Volume 4, Issue 3, Pages 294-304. Journal Article. https://www.sciencedirect.com/science/article/pii/S2666449623000282

Citations: 3

Abstract

Fire-detection technology plays a critical role in ensuring public safety and facilitating the development of smart cities. Early fire detection is imperative to mitigate potential hazards and minimize associated losses. However, existing vision-based fire-detection methods exhibit limited generalizability and fail to adequately consider the effect of fire object size on detection accuracy. To address these issues, this study uses a decoder-free fully transformer-based (DFFT) detector for early smoke and flame detection, improving detection performance for fires of different sizes. The method captures multi-level, multi-scale fire features with rich semantic information while using two encoders to maintain the accuracy of prediction from a single feature map. First, data augmentation is performed to enhance the generalizability of the model. Second, the detection-oriented transformer (DOT) backbone network is treated as a single-layer fire-feature extractor to obtain fire-related features on four scales, which are then fed into an encoder-only single-layer dense prediction module. Finally, the prediction module aggregates the multi-scale fire features into a single feature map using a scale-aggregated encoder (SAE), and then aligns the classification and regression features using a task-aligned encoder (TAE) to ensure semantic interaction between the classification and regression predictions. Experimental results on one private dataset and one public dataset demonstrate that the adopted DFFT achieves high detection accuracy and strong generalizability for fires of different sizes, particularly early small fires. The DFFT achieved mean average precision (mAP) values of 87.40% and 81.12% on the two datasets, outperforming the baseline models. It detects flame objects better than smoke objects because flame features are more prominent.
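The mAP figures reported in the abstract are means of per-class average precision (AP), here over the flame and smoke classes. The abstract does not state the IoU threshold or the interpolation rule used, so the following is only a minimal sketch under assumptions: detections are taken to be already matched to ground truth (each flagged as a true or false positive), and AP is computed with all-point interpolation. The function names and the toy scores are hypothetical and are not results from the paper.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """All-point interpolated AP for one class (assumed protocol).

    scores: confidence of each detection.
    is_true_positive: whether each detection matched a ground-truth box.
    num_ground_truth: number of ground-truth boxes of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(zip(scores, is_true_positive), key=lambda pair: -pair[0])
    recalls, precisions = [], []
    true_positives = 0
    for rank, (_, hit) in enumerate(ranked, start=1):
        true_positives += 1 if hit else 0
        recalls.append(true_positives / num_ground_truth)
        precisions.append(true_positives / rank)
    # Replace each precision with the maximum precision at any higher rank,
    # which makes the precision-recall curve monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the resulting step curve.
    ap, previous_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class):
    """Mean of per-class AP; per_class maps a class name to the
    (scores, is_true_positive, num_ground_truth) triple for that class."""
    aps = [average_precision(*triple) for triple in per_class.values()]
    return sum(aps) / len(aps)


# Toy detections (hypothetical values, for illustration only).
toy = {
    "flame": ([0.9, 0.8, 0.7], [True, True, False], 2),
    "smoke": ([0.9, 0.6, 0.5], [False, True, True], 2),
}
print({name: round(average_precision(*t), 4) for name, t in toy.items()})
print(round(mean_average_precision(toy), 4))
```

In this toy case the smoke class scores a lower AP because its most confident detection is a false positive. A per-class breakdown of this kind is what allows the flame and smoke results to be compared separately.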


