Efficient Feature Extraction for High-resolution Video Frame Interpolation

M. Nottebaum, S. Roth, Simone Schaub-Meyer
DOI: 10.48550/arXiv.2211.14005 (https://doi.org/10.48550/arXiv.2211.14005)
Published in: BMVC: Proceedings of the British Machine Vision Conference, 2022, p. 825
Publication date: 2022-11-25
Citations: 1

Abstract

Most deep learning methods for video frame interpolation consist of three main components: feature extraction, motion estimation, and image synthesis. Existing approaches are mainly distinguished by how these modules are designed. However, when interpolating high-resolution images, e.g., at 4K, the design choices for achieving high accuracy within reasonable memory requirements are limited. The feature extraction layers help to compress the input and extract information relevant to the later stages, such as motion estimation. However, these layers are often costly in parameters, computation time, and memory. We show how ideas from dimensionality reduction, combined with a lightweight optimization, can be used to compress the input representation while keeping the extracted information suitable for frame interpolation. Moreover, we require neither a pretrained flow network nor a synthesis network, which further reduces the number of trainable parameters and the required memory. When evaluated on three 4K benchmarks, we achieve state-of-the-art image quality among methods without pretrained flow, while having the lowest network complexity and memory requirements overall.
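The core idea of the abstract, compressing the input representation with dimensionality reduction before the motion-estimation stage, can be illustrated with a generic PCA-on-patches sketch. This is only an illustration of the general technique, not the authors' actual pipeline; the function name `pca_patch_features` and the chosen patch size and component count are hypothetical:

```python
import numpy as np

def pca_patch_features(frame, patch=8, n_components=16, basis=None):
    """Compress a grayscale frame into low-dimensional per-patch features.

    Splits the frame into non-overlapping patch x patch blocks and projects
    each block onto the top n_components principal directions. If basis is
    None, the basis is estimated from this frame's own patches; passing a
    shared basis lets both input frames use the same compressed space.
    """
    h, w = frame.shape
    h, w = h - h % patch, w - w % patch          # crop to a multiple of patch
    blocks = (frame[:h, :w]
              .reshape(h // patch, patch, w // patch, patch)
              .transpose(0, 2, 1, 3)
              .reshape(-1, patch * patch))       # (num_patches, patch*patch)
    blocks = blocks - blocks.mean(axis=0, keepdims=True)
    if basis is None:
        # Principal directions = right singular vectors of the patch matrix.
        _, _, vt = np.linalg.svd(blocks, full_matrices=False)
        basis = vt[:n_components]                # (n_components, patch*patch)
    feats = blocks @ basis.T                     # (num_patches, n_components)
    return feats.reshape(h // patch, w // patch, n_components), basis

# Example: a 64x64 frame becomes an 8x8 grid of 16-dim features, i.e. each
# 64-pixel patch is compressed to 16 coefficients (4x channel reduction).
rng = np.random.default_rng(0)
frame = rng.standard_normal((64, 64))
feats, basis = pca_patch_features(frame)
print(feats.shape)  # (8, 8, 16)
```

The point of such a compression is that the downstream motion estimator operates on a spatially downsampled, low-dimensional grid instead of full-resolution pixels, which is what makes 4K inputs tractable in memory.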