{"title":"FreqGAN: Infrared and Visible Image Fusion via Unified Frequency Adversarial Learning","authors":"Zhishe Wang;Zhuoqun Zhang;Wuqiang Qi;Fengbao Yang;Jiawei Xu","doi":"10.1109/TCSVT.2024.3460172","DOIUrl":null,"url":null,"abstract":"Traditional fusion methods based on deep learning mainly employ convolutional or self-attention operations to model local or global dependencies, which often lead to the oversight of frequency-domain information. To address this deficiency, we introduce a unified frequency adversarial learning network, termed FreqGAN. Our method involves a frequency-compensated generator that employs discrete wavelet transformation to decompose encoded spatial features into multiple frequency bands. Leveraging skip connections, low and high-frequency components are respectively directed into the encoder and decoder, compensating for additional outline and detail. Moreover, we construct a hybrid frequency aggregation module, which enables a progressive optimization of activity levels across multiple scales and makes the various frequency bands correlated. Complementing our generative model, we devise dual frequency-constrained discriminators. These discriminators are tasked with dynamically adjusting weights for each input frequency band, thereby obligating the generator to accurately reconstruct salient frequency information from different modality images. Additionally, a frequency-supervised function is formulated to further safeguard against the loss of frequency information. Our comprehensive experimental evaluations, encompassing a wide range of fusion tasks and subsequent applications, distinctly highlight FreqGAN’s superior performance, establishing it as a frontrunner in comparison to existing state-of-the-art alternatives. The source codes are forthcoming at: <uri>https://github.com/Zhishe-Wang/FreqGAN</uri>.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 1","pages":"728-740"},"PeriodicalIF":11.1000,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10680110/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0
Abstract
Traditional deep learning-based fusion methods mainly employ convolutional or self-attention operations to model local or global dependencies, which often leads them to overlook frequency-domain information. To address this deficiency, we introduce a unified frequency adversarial learning network, termed FreqGAN. Our method involves a frequency-compensated generator that employs the discrete wavelet transform to decompose encoded spatial features into multiple frequency bands. Through skip connections, the low- and high-frequency components are directed into the encoder and decoder, respectively, compensating for additional outline and detail information. Moreover, we construct a hybrid frequency aggregation module, which progressively optimizes activity levels across multiple scales and correlates the various frequency bands. Complementing our generative model, we devise dual frequency-constrained discriminators, which dynamically adjust the weights of each input frequency band, thereby obligating the generator to accurately reconstruct salient frequency information from the different modality images. Additionally, a frequency-supervised loss function is formulated to further safeguard against the loss of frequency information. Comprehensive experimental evaluations, encompassing a wide range of fusion tasks and downstream applications, demonstrate FreqGAN's superior performance over existing state-of-the-art methods. The source code is available at: https://github.com/Zhishe-Wang/FreqGAN.
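The abstract gives no implementation details; the sketch below is a minimal, hypothetical PyTorch illustration of the two frequency-domain ingredients it names: a single-level Haar wavelet decomposition of feature maps into one low-frequency and three high-frequency bands, and a wavelet-domain supervised loss. The function names, the band-fusion targets, and the loss weights are assumptions made for illustration, not the authors' actual design.

import torch
import torch.nn.functional as F


def haar_dwt(x: torch.Tensor):
    """Single-level 2-D Haar DWT of a feature map of shape (B, C, H, W).

    Assumes H and W are even. Returns the low-frequency band LL and the
    three high-frequency bands (LH, HL, HH), each of shape (B, C, H/2, W/2).
    """
    a = x[:, :, 0::2, 0::2]  # even rows, even cols
    b = x[:, :, 0::2, 1::2]  # even rows, odd cols
    c = x[:, :, 1::2, 0::2]  # odd rows, even cols
    d = x[:, :, 1::2, 1::2]  # odd rows, odd cols
    ll = (a + b + c + d) / 2.0  # approximation band: coarse outline
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, (lh, hl, hh)


def frequency_supervised_loss(fused, ir, vis, w_low=1.0, w_high=1.0):
    """Hypothetical frequency-supervised loss (not the paper's exact form).

    Pulls the fused image's low band toward the mean of the two source
    low bands, and each fused high band toward the source detail
    coefficient with the larger magnitude (the more salient detail).
    """
    f_ll, f_high = haar_dwt(fused)
    i_ll, i_high = haar_dwt(ir)
    v_ll, v_high = haar_dwt(vis)
    loss = w_low * F.l1_loss(f_ll, (i_ll + v_ll) / 2.0)
    for fh, ih, vh in zip(f_high, i_high, v_high):
        target = torch.where(ih.abs() >= vh.abs(), ih, vh)
        loss = loss + w_high * F.l1_loss(fh, target)
    return loss


if __name__ == "__main__":
    fused = torch.rand(1, 1, 64, 64)
    ir = torch.rand(1, 1, 64, 64)
    vis = torch.rand(1, 1, 64, 64)
    print(frequency_supervised_loss(fused, ir, vis).item())

In this sketch the low-frequency target averages the two source approximations (global outline), while each high-frequency target keeps the larger-magnitude detail coefficient from either modality (salient edges and textures). In FreqGAN itself, the per-band weighting is adjusted dynamically by the frequency-constrained discriminators rather than being fixed as here.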
Journal Introduction:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.