FreqGAN: Infrared and Visible Image Fusion via Unified Frequency Adversarial Learning

IF 11.1 · CAS Region 1 (Engineering & Technology) · JCR Q1, Engineering, Electrical & Electronic · IEEE Transactions on Circuits and Systems for Video Technology · Published: 2024-09-13 · DOI: 10.1109/TCSVT.2024.3460172
Zhishe Wang, Zhuoqun Zhang, Wuqiang Qi, Fengbao Yang, Jiawei Xu
{"title":"FreqGAN: Infrared and Visible Image Fusion via Unified Frequency Adversarial Learning","authors":"Zhishe Wang;Zhuoqun Zhang;Wuqiang Qi;Fengbao Yang;Jiawei Xu","doi":"10.1109/TCSVT.2024.3460172","DOIUrl":null,"url":null,"abstract":"Traditional fusion methods based on deep learning mainly employ convolutional or self-attention operations to model local or global dependencies, which often lead to the oversight of frequency-domain information. To address this deficiency, we introduce a unified frequency adversarial learning network, termed FreqGAN. Our method involves a frequency-compensated generator that employs discrete wavelet transformation to decompose encoded spatial features into multiple frequency bands. Leveraging skip connections, low and high-frequency components are respectively directed into the encoder and decoder, compensating for additional outline and detail. Moreover, we construct a hybrid frequency aggregation module, which enables a progressive optimization of activity levels across multiple scales and makes the various frequency bands correlated. Complementing our generative model, we devise dual frequency-constrained discriminators. These discriminators are tasked with dynamically adjusting weights for each input frequency band, thereby obligating the generator to accurately reconstruct salient frequency information from different modality images. Additionally, a frequency-supervised function is formulated to further safeguard against the loss of frequency information. Our comprehensive experimental evaluations, encompassing a wide range of fusion tasks and subsequent applications, distinctly highlight FreqGAN’s superior performance, establishing it as a frontrunner in comparison to existing state-of-the-art alternatives. 
The source codes are forthcoming at: <uri>https://github.com/Zhishe-Wang/FreqGAN</uri>.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"35 1","pages":"728-740"},"PeriodicalIF":11.1000,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Circuits and Systems for Video Technology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10680110/","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

Traditional deep learning-based fusion methods mainly employ convolutional or self-attention operations to model local or global dependencies, which often leads to frequency-domain information being overlooked. To address this deficiency, we introduce a unified frequency adversarial learning network, termed FreqGAN. Our method involves a frequency-compensated generator that employs the discrete wavelet transform to decompose encoded spatial features into multiple frequency bands. Through skip connections, the low- and high-frequency components are directed into the encoder and decoder respectively, compensating for additional outline and detail information. Moreover, we construct a hybrid frequency aggregation module, which progressively optimizes activity levels across multiple scales and correlates the various frequency bands. Complementing the generator, we devise dual frequency-constrained discriminators that dynamically adjust the weights of each input frequency band, thereby obliging the generator to accurately reconstruct salient frequency information from the different modality images. Additionally, a frequency-supervised loss function is formulated to further safeguard against the loss of frequency information. Comprehensive experimental evaluations, spanning a wide range of fusion tasks and downstream applications, highlight FreqGAN's superior performance over existing state-of-the-art alternatives. The source code will be released at: https://github.com/Zhishe-Wang/FreqGAN.
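The wavelet decomposition at the heart of the generator can be illustrated with a one-level 2D Haar DWT, the simplest discrete wavelet transform. The sketch below is illustrative only — the abstract does not specify the wavelet basis or implementation — but it shows the key property the architecture relies on: a feature map splits into one low-frequency band (LL, coarse outline) and three high-frequency bands (LH/HL/HH, horizontal, vertical, and diagonal detail), and the transform is perfectly invertible, so routing bands through skip connections discards no information.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar DWT: split a 2-D map (even height/width)
    into low-frequency (LL) and high-frequency (LH, HL, HH) bands."""
    a = x[0::2, 0::2]  # top-left of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0   # low-frequency band (coarse outline)
    lh = (a - b + c - d) / 2.0   # horizontal detail
    hl = (a + b - c - d) / 2.0   # vertical detail
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse one-level Haar DWT: perfect reconstruction of the input."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

# Demo: decompose a toy 4x4 feature map and reconstruct it exactly.
feat = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(feat)
recon = haar_idwt2(ll, lh, hl, hh)
print(np.allclose(recon, feat))  # perfect reconstruction
```

In the paper's design, the LL band would feed back into the encoder (outline compensation) while LH/HL/HH feed the decoder (detail compensation); a practical implementation would apply such a transform per channel to batched tensors, e.g. via PyWavelets or a fixed-kernel strided convolution.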
Journal metrics: CiteScore 13.80 · Self-citation rate 27.40% · Articles per year 660 · Review time 5 months
Journal description: The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.