Machine Learning Accelerated Transform Search For AV1

Hui Su, Mingliang Chen, A. Bokov, D. Mukherjee, Yunqing Wang, Yue Chen
{"title":"Machine Learning Accelerated Transform Search For AV1","authors":"Hui Su, Mingliang Chen, A. Bokov, D. Mukherjee, Yunqing Wang, Yue Chen","doi":"10.1109/PCS48520.2019.8954514","DOIUrl":null,"url":null,"abstract":"AV1 is the state-of-the-art open and royalty-free video compression format that achieves significant bitrate savings over previous generation of video codecs. One of AV1’s major improvement over its predecessor VP9 is the support of more diverse and flexible transform size and kernel selection. However, it also drastically increases the search space for transform unit rate-distortion optimization in AV1 encoders. Unlike conventional encoder speed features that are based on heuristics, we propose a machine learning (ML) based approach to accelerate the transform size and kernel search for AV1. The ML models use input features extracted from the prediction residue block such as standard deviation, correlation and energy distribution. The output of the models indicates the estimated likelihood of which transform size and kernel would be selected as the optimal choice. Based on the ML models, the encoder can prune out the transform size and kernel candidates that are unlikely to be selected and save unnecessary computation to compute their rate-distortion cost. The proposed approach is implemented and tested on the AV1 reference library libaom. The experimental results show that satisfactory encoding speed improvement can be achieved with extremely low compression performance loss. The framework and methodology can also be easily migrated to other video codecs and implementations.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 Picture Coding Symposium (PCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PCS48520.2019.8954514","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

Abstract

AV1 is the state-of-the-art open and royalty-free video compression format that achieves significant bitrate savings over previous generation of video codecs. One of AV1’s major improvement over its predecessor VP9 is the support of more diverse and flexible transform size and kernel selection. However, it also drastically increases the search space for transform unit rate-distortion optimization in AV1 encoders. Unlike conventional encoder speed features that are based on heuristics, we propose a machine learning (ML) based approach to accelerate the transform size and kernel search for AV1. The ML models use input features extracted from the prediction residue block such as standard deviation, correlation and energy distribution. The output of the models indicates the estimated likelihood of which transform size and kernel would be selected as the optimal choice. Based on the ML models, the encoder can prune out the transform size and kernel candidates that are unlikely to be selected and save unnecessary computation to compute their rate-distortion cost. The proposed approach is implemented and tested on the AV1 reference library libaom. The experimental results show that satisfactory encoding speed improvement can be achieved with extremely low compression performance loss. The framework and methodology can also be easily migrated to other video codecs and implementations.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
AV1是最先进的开放和免版税的视频压缩格式,实现显着比特率节省比上一代视频编解码器。AV1相对于其前身VP9的主要改进之一是支持更多样化和灵活的转换大小和内核选择。然而,它也大大增加了AV1编码器中变换单位率失真优化的搜索空间。与传统的基于启发式的编码器速度特征不同,我们提出了一种基于机器学习(ML)的方法来加速AV1的变换大小和内核搜索。机器学习模型使用从预测残差块中提取的输入特征,如标准差、相关性和能量分布。模型的输出表明了哪种变换大小和核被选择为最优选择的估计可能性。基于机器学习模型,编码器可以剔除不太可能被选中的变换大小和候选核,节省不必要的计算来计算它们的率失真代价。该方法在AV1参考图书馆图书馆上进行了实现和测试。实验结果表明,该方法可以在极低的压缩性能损失下实现令人满意的编码速度提升。该框架和方法也可以很容易地移植到其他视频编解码器和实现中。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Efficient Delivery of Very High Dynamic Range Compressed Imagery by Dynamic-Range-of-Interest Novel Coding Tools Based on Characteristics for Short Videos Extending Video Decoding Energy Models for 360° and HDR Video Formats in HEVC Generalized binary splits: A versatile partitioning scheme for block-based hybrid video coding An IBP-CNN Based Fast Block Partition For Intra Prediction
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1