End-to-End Video Snapshot Compressive Imaging using Video Transformers

Authors: Wael Saideni, F. Courrèges, D. Helbert, J. Cances
Published in: 2022 Eleventh International Conference on Image Processing Theory, Tools and Applications (IPTA)
Publication date: 2022-04-19
DOI: 10.1109/IPTA54936.2022.9784128
This paper presents a novel reconstruction algorithm for video Snapshot Compressive Imaging (SCI). Inspired by recent work on Transformers and the self-attention mechanism in computer vision, we propose the first video SCI reconstruction algorithm built upon Transformers, capturing long-range spatio-temporal dependencies that enable deep learning of feature maps. Our approach is based on a Spatiotemporal Convolutional Multi-head Attention (ST-ConvMHA) module, which exploits the spatial and temporal information of video scenes instead of using fully-connected attention layers. To evaluate the performance of our approach, we train our algorithm on the DAVIS2017 dataset and test the trained models on six benchmark datasets. The results obtained in terms of PSNR, SSIM, and especially reconstruction time demonstrate that our reconstruction approach is suitable for real-time applications. We believe that our research will motivate future work on video reconstruction approaches.
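The abstract describes an attention module that derives queries, keys, and values with convolutions (preserving local spatial structure) and then attends over spatio-temporal tokens. The paper does not specify the exact architecture here, so the following is only a minimal NumPy sketch of the general idea — convolutional Q/K/V projections followed by multi-head self-attention over all frame pixels — with all function names, shapes, and the 3x3 kernel choice being illustrative assumptions, not the authors' ST-ConvMHA implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def conv3x3(x, w):
    # naive "same" 2D convolution applied independently to each frame
    # x: (T, H, W, Cin), w: (3, 3, Cin, Cout)
    T, H, W, _ = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1), (0, 0)))
    out = np.zeros((T, H, W, w.shape[-1]))
    for i in range(3):
        for j in range(3):
            out += np.einsum('thwc,cd->thwd', xp[:, i:i+H, j:j+W, :], w[i, j])
    return out

def conv_multihead_attention(x, wq, wk, wv, n_heads):
    # Q/K/V come from convolutions (local spatial context), then standard
    # multi-head attention runs over all T*H*W spatio-temporal tokens.
    T, H, W, C = x.shape
    q = conv3x3(x, wq).reshape(T * H * W, C)
    k = conv3x3(x, wk).reshape(T * H * W, C)
    v = conv3x3(x, wv).reshape(T * H * W, C)
    d = C // n_heads
    heads = []
    for h in range(n_heads):
        qh, kh, vh = (m[:, h*d:(h+1)*d] for m in (q, k, v))
        attn = softmax(qh @ kh.T / np.sqrt(d))  # (tokens, tokens)
        heads.append(attn @ vh)
    return np.concatenate(heads, axis=-1).reshape(T, H, W, C)

# toy video: 4 frames of 8x8 pixels, 8 channels
rng = np.random.default_rng(0)
T, H, W, C = 4, 8, 8, 8
x = rng.standard_normal((T, H, W, C))
make_w = lambda: rng.standard_normal((3, 3, C, C)) / np.sqrt(9 * C)
y = conv_multihead_attention(x, make_w(), make_w(), make_w(), n_heads=2)
print(y.shape)  # (4, 8, 8, 8)
```

Attending over every spatio-temporal token at once is what gives the long-range dependency modeling mentioned in the abstract, at quadratic cost in the number of tokens; the convolutional projections are what distinguish this from plain fully-connected attention layers.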