Deblurring Videos Using Spatial-Temporal Contextual Transformer With Feature Propagation

IEEE Transactions on Image Processing (a publication of the IEEE Signal Processing Society), IF 13.7. Pub Date: 2024-10-24. DOI: 10.1109/TIP.2024.3482176

Liyan Zhang;Boming Xu;Zhongbao Yang;Jinshan Pan

Volume 33, pages 6354-6366. https://ieeexplore.ieee.org/document/10735093/

Citations: 0

Abstract

We present a simple and effective approach that exploits both local spatial-temporal contexts and non-local temporal information for video deblurring. First, we develop a spatial-temporal contextual transformer to explore local spatial-temporal contexts in videos. Because the features extracted by the spatial-temporal contextual transformer do not model the non-local temporal information of a video well, we then develop a feature propagation method that aggregates useful features from long-range frames, so that both local spatial-temporal contexts and non-local temporal information can be better utilized for video deblurring. Finally, we integrate the spatial-temporal contextual transformer and the feature propagation into a unified deep convolutional neural network (CNN) and train it in an end-to-end manner. We show that combining the spatial-temporal contextual transformer with the feature propagation generates useful features and makes the deep CNN model more compact and effective for video deblurring. Extensive experimental results show that the proposed method performs favorably against state-of-the-art methods on the benchmark datasets in terms of accuracy and number of model parameters.
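The two-stage idea in the abstract (a local stage followed by long-range propagation) can be illustrated with a toy sketch. This is not the authors' network: the abstract gives no architectural details, so `local_context` replaces the transformer with a plain temporal window average, `propagate` replaces the learned propagation with a fixed-weight forward and backward recurrence, and all names and the parameters `radius` and `alpha` are hypothetical.

```python
from typing import List

Feature = List[float]  # one feature vector per frame (toy stand-in for a feature map)


def local_context(frames: List[Feature], radius: int = 1) -> List[Feature]:
    """Local stage (assumption): average each frame with its neighbours inside
    a short temporal window. The paper uses a transformer here instead."""
    out = []
    for t in range(len(frames)):
        lo, hi = max(0, t - radius), min(len(frames), t + radius + 1)
        window = frames[lo:hi]
        out.append([sum(vals) / len(window) for vals in zip(*window)])
    return out


def propagate(feats: List[Feature], alpha: float = 0.5) -> List[Feature]:
    """Propagation stage (assumption): carry a running state forward and
    backward through the sequence, then average the two sweeps, so every
    frame receives information from frames outside the local window."""
    def sweep(seq: List[Feature]) -> List[Feature]:
        state, out = None, []
        for f in seq:
            if state is None:
                state = list(f)
            else:
                state = [alpha * s + (1 - alpha) * x for s, x in zip(state, f)]
            out.append(state)
        return out

    fwd = sweep(feats)
    bwd = sweep(feats[::-1])[::-1]
    return [[(a + b) / 2 for a, b in zip(ff, bb)] for ff, bb in zip(fwd, bwd)]


def deblur_features(frames: List[Feature], radius: int = 1, alpha: float = 0.5) -> List[Feature]:
    """Chain the two stages: local context first, long-range propagation second."""
    return propagate(local_context(frames, radius), alpha)


if __name__ == "__main__":
    # Only the last frame carries signal; the local stage alone cannot
    # deliver it to frame 0, but propagation can.
    frames = [[0.0], [0.0], [0.0], [0.0], [8.0]]
    print(local_context(frames)[0], deblur_features(frames)[0])
```

In this toy, the local stage leaves frame 0 untouched by the signal in the last frame, while the propagated result for frame 0 is non-zero, which is the gap between local and non-local temporal information that the abstract describes.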

Latest articles from this journal

Sharpness-aware Fine-Tuning for OOD Detection

Match Any Keypoints

USIGAN: Unbalanced Self-Information Feature Transport for Weakly Paired Image IHC Virtual Staining

SynPO: Synergizing Descriptiveness and Preference Optimization for Video Detailed Captioning

JDPNet: A Network Based on Joint Degradation Processing for Underwater Image Enhancement