A Practical Gated Recurrent Transformer Network Incorporating Multiple Fusions for Video Denoising

Kai Guo, Seungwon Choi, Jongseong Choi, Lae-Hoon Kim
arXiv - EE - Image and Video Processing, published 2024-09-10
DOI: arxiv-2409.06603 (https://doi.org/arxiv-2409.06603)
Citations: 0

Abstract

State-of-the-art (SOTA) video denoising methods employ multi-frame simultaneous denoising mechanisms, resulting in significant delays (e.g., 16 frames) that make them impractical for real-time cameras. To overcome this limitation, we propose a multi-fusion gated recurrent Transformer network (GRTN) that achieves SOTA denoising performance with only a single-frame delay. Specifically, the spatial denoising module extracts features from the current frame, while the reset gate selects relevant information from the previous frame and fuses it with the current frame features via the temporal denoising module. The update gate then further blends this result with the previous frame features, and the reconstruction module integrates it with the current frame. To robustly compute attention for noisy features, we propose a residual simplified Swin Transformer with Euclidean distance (RSSTE) in the spatial and temporal denoising modules. Comparative objective and subjective results show that our GRTN achieves denoising performance comparable to SOTA multi-frame delay networks, with only a single-frame delay.
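
The gating and attention ideas in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract gives no equations, so the scoring rule (softmax over negative squared Euclidean distance), the scalar gates, the averaging fusion, and every function name below are assumptions chosen only to show the data flow of reset gate, temporal fusion, and update gate.

```python
import math

def sq_euclidean(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def euclidean_attention(queries, keys, values, scale=1.0):
    # ASSUMPTION: similarity is the negative squared Euclidean distance,
    # so closer key vectors receive larger attention weights.
    out = []
    for q in queries:
        weights = softmax([-sq_euclidean(q, k) / scale for k in keys])
        dim = len(values[0])
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_recurrent_step(cur_feat, prev_feat, reset_logit, update_logit):
    # ASSUMPTION: gates are single scalars here; a real network would
    # predict them per feature from the data.
    r = sigmoid(reset_logit)   # reset gate: how much of the previous frame to expose
    z = sigmoid(update_logit)  # update gate: how much of the new candidate to keep
    gated_prev = [[r * x for x in tok] for tok in prev_feat]
    # Temporal fusion: current tokens attend to reset-gated previous tokens.
    attended = euclidean_attention(cur_feat, gated_prev, gated_prev)
    # ASSUMPTION: placeholder fusion that averages current and attended features.
    candidate = [[0.5 * (c + a) for c, a in zip(ct, at)]
                 for ct, at in zip(cur_feat, attended)]
    # Update gate blends the candidate with the previous frame features.
    return [[(1.0 - z) * p + z * c for p, c in zip(pt, ct)]
            for pt, ct in zip(prev_feat, candidate)]

if __name__ == "__main__":
    cur = [[1.0, 1.0], [0.0, 2.0]]    # hypothetical current-frame tokens
    prev = [[0.9, 1.1], [0.1, 1.8]]   # hypothetical previous-frame tokens
    print(gated_recurrent_step(cur, prev, reset_logit=2.0, update_logit=0.0))
```

Because each step needs only the current frame and the recurrent state carried over from the previous one, a loop over frames built on `gated_recurrent_step` never has to buffer a window of future frames, which is the property the abstract describes as single-frame delay.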