使用带有特征传播功能的时空上下文变换器对视频进行去模糊处理

Liyan Zhang;Boming Xu;Zhongbao Yang;Jinshan Pan
{"title":"使用带有特征传播功能的时空上下文变换器对视频进行去模糊处理","authors":"Liyan Zhang;Boming Xu;Zhongbao Yang;Jinshan Pan","doi":"10.1109/TIP.2024.3482176","DOIUrl":null,"url":null,"abstract":"We present a simple and effective approach to explore both local spatial-temporal contexts and non-local temporal information for video deblurring. First, we develop an effective spatial-temporal contextual transformer to explore local spatial-temporal contexts from videos. As the features extracted by the spatial-temporal contextual transformer does not model the non-local temporal information of video well, we then develop a feature propagation method to aggregate useful features from the long-range frames so that both local spatial-temporal contexts and non-local temporal information can be better utilized for video deblurring. Finally, we formulate the spatial-temporal contextual transformer with the feature propagation into a unified deep convolutional neural network (CNN) and train it in an end-to-end manner. We show that using the spatial-temporal contextual transformer with the feature propagation is able to generate useful features and makes the deep CNN model more compact and effective for video deblurring. Extensive experimental results show that the proposed method performs favorably against state-of-the-art ones on the benchmark datasets in terms of accuracy and model parameters.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"33 ","pages":"6354-6366"},"PeriodicalIF":0.0000,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deblurring Videos Using Spatial-Temporal Contextual Transformer With Feature Propagation\",\"authors\":\"Liyan Zhang;Boming Xu;Zhongbao Yang;Jinshan Pan\",\"doi\":\"10.1109/TIP.2024.3482176\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a simple and effective approach to explore both local spatial-temporal contexts and non-local temporal information for video deblurring. First, we develop an effective spatial-temporal contextual transformer to explore local spatial-temporal contexts from videos. As the features extracted by the spatial-temporal contextual transformer does not model the non-local temporal information of video well, we then develop a feature propagation method to aggregate useful features from the long-range frames so that both local spatial-temporal contexts and non-local temporal information can be better utilized for video deblurring. Finally, we formulate the spatial-temporal contextual transformer with the feature propagation into a unified deep convolutional neural network (CNN) and train it in an end-to-end manner. We show that using the spatial-temporal contextual transformer with the feature propagation is able to generate useful features and makes the deep CNN model more compact and effective for video deblurring. Extensive experimental results show that the proposed method performs favorably against state-of-the-art ones on the benchmark datasets in terms of accuracy and model parameters.\",\"PeriodicalId\":94032,\"journal\":{\"name\":\"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society\",\"volume\":\"33 \",\"pages\":\"6354-6366\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-10-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10735093/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10735093/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

我们提出了一种简单有效的方法,既能探索视频去模糊的局部时空背景,又能探索非局部时空信息。首先,我们开发了一种有效的时空上下文转换器来探索视频中的局部时空上下文。由于空间-时间上下文变换器提取的特征不能很好地模拟视频的非局部时间信息,因此我们开发了一种特征传播方法,从远距离帧中汇集有用的特征,从而更好地利用局部空间-时间上下文和非局部时间信息进行视频去模糊。最后,我们将空间-时间上下文转换器与特征传播技术整合为一个统一的深度卷积神经网络(CNN),并以端到端的方式对其进行训练。我们的研究表明,将时空上下文变换器与特征传播相结合能够生成有用的特征,并使深度卷积神经网络模型更紧凑、更有效地用于视频去模糊。广泛的实验结果表明,在基准数据集上,所提出的方法在准确性和模型参数方面都优于最先进的方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Deblurring Videos Using Spatial-Temporal Contextual Transformer With Feature Propagation
We present a simple and effective approach to explore both local spatial-temporal contexts and non-local temporal information for video deblurring. First, we develop an effective spatial-temporal contextual transformer to explore local spatial-temporal contexts from videos. As the features extracted by the spatial-temporal contextual transformer does not model the non-local temporal information of video well, we then develop a feature propagation method to aggregate useful features from the long-range frames so that both local spatial-temporal contexts and non-local temporal information can be better utilized for video deblurring. Finally, we formulate the spatial-temporal contextual transformer with the feature propagation into a unified deep convolutional neural network (CNN) and train it in an end-to-end manner. We show that using the spatial-temporal contextual transformer with the feature propagation is able to generate useful features and makes the deep CNN model more compact and effective for video deblurring. Extensive experimental results show that the proposed method performs favorably against state-of-the-art ones on the benchmark datasets in terms of accuracy and model parameters.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Learning Cross-Attention Point Transformer With Global Porous Sampling Salient Object Detection From Arbitrary Modalities GSSF: Generalized Structural Sparse Function for Deep Cross-Modal Metric Learning AnlightenDiff: Anchoring Diffusion Probabilistic Model on Low Light Image Enhancement Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1