具有自适应时序先验和解码运动辅助质量增强功能的学习视频压缩技术

IF 5.2 3区 计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS ACM Transactions on Multimedia Computing Communications and Applications Pub Date : 2024-04-27 DOI:10.1145/3661824
Jiayu Yang, Chunhui Yang, Fei Xiong, Yongqi Zhai, Ronggang Wang
{"title":"具有自适应时序先验和解码运动辅助质量增强功能的学习视频压缩技术","authors":"Jiayu Yang, Chunhui Yang, Fei Xiong, Yongqi Zhai, Ronggang Wang","doi":"10.1145/3661824","DOIUrl":null,"url":null,"abstract":"<p>Learned video compression has drawn great attention and shown promising compression performance recently. In this paper, we focus on the two components in learned video compression framework, i.e., conditional entropy model and quality enhancement module, to improve compression performance. Specifically, we propose an adaptive spatial-temporal entropy model for image, motion and residual compression, which introduces temporal prior to reduce temporal redundancy of latents and an additional modulated mask to evaluate the similarity and perform refinement. Besides, a quality enhancement module is proposed for predicted frame and reconstructed frame to improve frame quality and reduce bitrate cost of residual coding. The module reuses decoded optical flow as motion prior and utilizes deformable convolution to mine high-quality information from reference frame in a bit-free manner. The two proposed coding tools are integrated into a pixel-domain residual-coding based compression framework to evaluate their effectiveness. Experimental results demonstrate that our framework achieves competitive compression performance in low-delay scenario, compared with recent learning-based methods and traditional H.265/HEVC in terms of PSNR and MS-SSIM. The code is available at OpenLVC.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":"50 1","pages":""},"PeriodicalIF":5.2000,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Learned Video Compression with Adaptive Temporal Prior and Decoded Motion-aided Quality Enhancement\",\"authors\":\"Jiayu Yang, Chunhui Yang, Fei Xiong, Yongqi Zhai, Ronggang Wang\",\"doi\":\"10.1145/3661824\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Learned video compression has drawn great attention and shown promising compression performance recently. In this paper, we focus on the two components in learned video compression framework, i.e., conditional entropy model and quality enhancement module, to improve compression performance. Specifically, we propose an adaptive spatial-temporal entropy model for image, motion and residual compression, which introduces temporal prior to reduce temporal redundancy of latents and an additional modulated mask to evaluate the similarity and perform refinement. Besides, a quality enhancement module is proposed for predicted frame and reconstructed frame to improve frame quality and reduce bitrate cost of residual coding. The module reuses decoded optical flow as motion prior and utilizes deformable convolution to mine high-quality information from reference frame in a bit-free manner. The two proposed coding tools are integrated into a pixel-domain residual-coding based compression framework to evaluate their effectiveness. Experimental results demonstrate that our framework achieves competitive compression performance in low-delay scenario, compared with recent learning-based methods and traditional H.265/HEVC in terms of PSNR and MS-SSIM. The code is available at OpenLVC.</p>\",\"PeriodicalId\":50937,\"journal\":{\"name\":\"ACM Transactions on Multimedia Computing Communications and Applications\",\"volume\":\"50 1\",\"pages\":\"\"},\"PeriodicalIF\":5.2000,\"publicationDate\":\"2024-04-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Multimedia Computing Communications and Applications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3661824\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Multimedia Computing Communications and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3661824","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0

摘要

最近,学习视频压缩引起了广泛关注,并显示出良好的压缩性能。在本文中,我们将重点关注学习视频压缩框架中的两个组件,即条件熵模型和质量增强模块,以提高压缩性能。具体来说,我们提出了一种用于图像、运动和残差压缩的自适应时空熵模型,该模型引入了时间先验来减少潜变量的时间冗余,并引入了一个额外的调制掩码来评估相似性并进行细化。此外,还针对预测帧和重建帧提出了质量增强模块,以提高帧质量并降低残差编码的比特率成本。该模块重新使用解码光流作为运动先验,并利用可变形卷积以无比特方式从参考帧中挖掘高质量信息。为了评估这两种编码工具的有效性,我们将它们集成到一个基于像素域残差编码的压缩框架中。实验结果表明,就 PSNR 和 MS-SSIM 而言,与最新的基于学习的方法和传统 H.265/HEVC 相比,我们的框架在低延迟场景下实现了有竞争力的压缩性能。代码可在 OpenLVC 上获取。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Learned Video Compression with Adaptive Temporal Prior and Decoded Motion-aided Quality Enhancement

Learned video compression has drawn great attention and shown promising compression performance recently. In this paper, we focus on the two components in learned video compression framework, i.e., conditional entropy model and quality enhancement module, to improve compression performance. Specifically, we propose an adaptive spatial-temporal entropy model for image, motion and residual compression, which introduces temporal prior to reduce temporal redundancy of latents and an additional modulated mask to evaluate the similarity and perform refinement. Besides, a quality enhancement module is proposed for predicted frame and reconstructed frame to improve frame quality and reduce bitrate cost of residual coding. The module reuses decoded optical flow as motion prior and utilizes deformable convolution to mine high-quality information from reference frame in a bit-free manner. The two proposed coding tools are integrated into a pixel-domain residual-coding based compression framework to evaluate their effectiveness. Experimental results demonstrate that our framework achieves competitive compression performance in low-delay scenario, compared with recent learning-based methods and traditional H.265/HEVC in terms of PSNR and MS-SSIM. The code is available at OpenLVC.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
8.50
自引率
5.90%
发文量
285
审稿时长
7.5 months
期刊介绍: The ACM Transactions on Multimedia Computing, Communications, and Applications is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It is soliciting paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome. TOMM is a peer-reviewed, archival journal, available in both print form and digital form. The Journal is published quarterly; with roughly 7 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure a timely publication. The transactions consists primarily of research papers. This is an archival journal and it is intended that the papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.
期刊最新文献
TA-Detector: A GNN-based Anomaly Detector via Trust Relationship KF-VTON: Keypoints-Driven Flow Based Virtual Try-On Network Unified View Empirical Study for Large Pretrained Model on Cross-Domain Few-Shot Learning Multimodal Fusion for Talking Face Generation Utilizing Speech-related Facial Action Units Compressed Point Cloud Quality Index by Combining Global Appearance and Local Details
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1