Infrared/Visible Light Fire Image Fusion Method Based on Generative Adversarial Network of Wavelet-Guided Pooling Vision Transformer

Forests (IF 2.4, Q1 FORESTRY) · Published: 2024-06-01 · DOI: 10.3390/f15060976
Haicheng Wei, Xinping Fu, Zhuokang Wang, Jing Zhao

Abstract

To address detail loss, limited matched datasets, and low fusion accuracy in infrared/visible-light fire image fusion, a novel method based on a Generative Adversarial Network with a Wavelet-Guided Pooling Vision Transformer (VTW-GAN) is proposed. The algorithm employs a generator and discriminator network architecture, integrating the efficient global representation capability of Transformers with wavelet-guided pooling to extract finer-grained features and reconstruct higher-quality fused images. To overcome the shortage of paired image data, transfer learning is used to adapt the pre-trained model to fire image fusion, thereby improving fusion precision. Experimental results demonstrate that VTW-GAN outperforms DenseFuse, IFCNN, U2Fusion, SwinFusion, and TGFuse in both objective and subjective evaluations. Specifically, on the KAIST dataset, the fused images improve Entropy (EN), Mutual Information (MI), and the gradient-based fusion quality metric (Qabf) by 2.78%, 11.89%, and 10.45%, respectively, over the next-best values. On the Corsican Fire dataset, compared to models trained only on the limited fire data, the transfer-learned fusion images improve Standard Deviation (SD) and MI by 10.69% and 11.73%, respectively; compared to the other methods, they perform well in Average Gradient (AG), SD, and MI, improving them by 3.43%, 4.84%, and 4.21%, respectively, over the next-best values. Compared with DenseFuse, runtime efficiency is improved by 78.3%. The method achieves favorable subjective image quality and is effective for fire-detection applications.
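The reference-free metrics cited above (EN, SD, AG) are standard measures of how much information and edge detail a fused image retains. As a minimal illustrative sketch (not the authors' code; the function names and toy image are assumptions), they can be computed for an 8-bit grayscale fused image with NumPy:

```python
import numpy as np

def entropy(img):
    # EN: Shannon entropy (bits) of the 8-bit intensity histogram.
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def avg_gradient(img):
    # AG: mean magnitude of local intensity differences,
    # a common proxy for preserved edge/texture detail.
    img = img.astype(np.float64)
    gx = np.diff(img, axis=1)[:-1, :]   # trim so gx and gy align
    gy = np.diff(img, axis=0)[:, :-1]
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2)))

# Toy "fused" image: left half black, right half white.
fused = np.zeros((16, 16), dtype=np.uint8)
fused[:, 8:] = 255

print("EN:", entropy(fused))        # two equally likely levels -> 1.0 bit
print("SD:", float(fused.std()))
print("AG:", avg_gradient(fused))
```

MI and Qabf additionally require the two source images (they score how much of each input survives in the fusion), so they are omitted from this sketch.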
Source Journal

Forests (FORESTRY)
CiteScore: 4.40
Self-citation rate: 17.20%
Annual articles: 1823
Review time: 19.02 days
About the journal: Forests (ISSN 1999-4907) is an international and cross-disciplinary scholarly journal of forestry and forest ecology. It publishes research papers, short communications, and review papers. There is no restriction on the length of papers. Our aim is to encourage scientists to publish their experimental and theoretical research in as much detail as possible. Full experimental and/or methodological details must be provided for research articles.