Degradation-aware deep unfolding network with transformer prior for video compressive imaging

Signal Processing (IF 3.4; Region 2, Engineering and Technology; JCR Q2, Engineering, Electrical & Electronic). Pub Date: 2024-08-30. DOI: 10.1016/j.sigpro.2024.109660

Jianfu Yin, Nan Wang, Binliang Hu, Yao Wang, Quan Wang

Volume 227, Article 109660. Publisher page: https://www.sciencedirect.com/science/article/pii/S0165168424002809

Citations: 0

Abstract

In video snapshot compressive imaging (SCI) systems, video reconstruction methods are used to recover spatial–temporal-correlated video frame signals from a compressed measurement. While unfolding methods have demonstrated promising performance, they encounter two challenges: (1) They lack the ability to estimate degradation patterns and the degree of ill-posedness from video SCI, which hampers guiding and supervising the iterative learning process. (2) The prevailing reliance on 3D-CNNs in these methods limits their capacity to capture long-range dependencies. To address these concerns, this paper introduces the Degradation-Aware Deep Unfolding Network (DADUN). DADUN leverages estimated priors from compressed frames and the physical mask to guide and control each iteration. We also develop a novel Bidirectional Propagation Convolutional Recurrent Neural Network (BiP-CRNN) that simultaneously captures both intra-frame contents and inter-frame dependencies. By plugging BiP-CRNN into DADUN, we establish a novel end-to-end (E2E) and data-dependent deep unfolding method, DADUN with transformer prior (TP), for video sequence reconstruction. Experimental results on various video sequences show the effectiveness of our proposed approach, which is also robust to random masks and has wide generalization bounds.
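For readers new to the setting, the following is a minimal sketch of the measurement model commonly used in video SCI, together with a generic unfolding loop. It is not DADUN: the abstract gives no equations, so the GAP-style projection, all function names, and the moving-average placeholder standing in for a learned prior are assumptions made purely for illustration.

```python
import numpy as np


def sci_forward(frames, masks):
    """Compress T frames into one snapshot: y = sum_t masks[t] * frames[t]."""
    return np.sum(masks * frames, axis=0)


def sci_transpose(y, masks):
    """Adjoint of the forward operator: spread a 2D snapshot back over T frames."""
    return masks * y[None, :, :]


def data_fidelity_step(x, y, masks):
    """GAP-style projection (an assumed choice) onto {x : sci_forward(x) = y}."""
    phi_sum = np.sum(masks ** 2, axis=0)
    # Pixels never sampled by any mask carry no information; avoid dividing by 0.
    phi_sum = np.where(phi_sum == 0, 1.0, phi_sum)
    residual = y - sci_forward(x, masks)
    return x + sci_transpose(residual / phi_sum, masks)


def placeholder_prior(x):
    """Hypothetical stand-in for a learned prior: 3-tap temporal moving average."""
    padded = np.pad(x, ((1, 1), (0, 0), (0, 0)), mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0


def unfold(y, masks, num_stages=10):
    """Alternate the data-fidelity projection and the prior for a fixed number of stages."""
    x = sci_transpose(y, masks)  # simple initialisation from the measurement
    for _ in range(num_stages):
        x = data_fidelity_step(x, y, masks)
        x = placeholder_prior(x)
    return x
```

In a learned unfolding network each stage would replace `placeholder_prior` with a trained module and could have its own parameters; the fixed averaging filter here only marks where such a module would sit.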


Source journal

Signal Processing (Engineering and Technology: Engineering, Electrical & Electronic)

CiteScore: 9.20

Self-citation rate: 9.10%

Articles published: 309

Review time: 41 days

About the journal: Signal Processing incorporates all aspects of the theory and practice of signal processing. It features original research work, tutorial and review articles, and accounts of practical developments. It is intended for the rapid dissemination of knowledge and experience to engineers and scientists working in the research, development or practical application of signal processing. Subject areas covered by the journal include: Signal Theory; Stochastic Processes; Detection and Estimation; Spectral Analysis; Filtering; Signal Processing Systems; Software Developments; Image Processing; Pattern Recognition; Optical Signal Processing; Digital Signal Processing; Multi-dimensional Signal Processing; Communication Signal Processing; Biomedical Signal Processing; Geophysical and Astrophysical Signal Processing; Earth Resources Signal Processing; Acoustic and Vibration Signal Processing; Data Processing; Remote Sensing; Signal Processing Technology; Radar Signal Processing; Sonar Signal Processing; Industrial Applications; New Applications.