Learn to overfit better: finding the important parameters for learned image compression

2021 International Conference on Visual Communications and Image Processing (VCIP). Pub Date: 2021-12-05. DOI: 10.1109/VCIP53242.2021.9675360

Honglei Zhang, Francesco Cricri, H. R. Tavakoli, M. Santamaría, Y. Lam, M. Hannuksela


Citations: 5

Abstract

For most machine learning systems, overfitting is an undesired behavior. However, overfitting a model to a test image or video at inference time is an effective technique for improving the coding efficiency of learning-based image and video codecs. At the encoding stage, one or more neural networks that are part of the codec are finetuned on the input image or video to achieve better coding performance. The encoder encodes the input content into a content bitstream. If the finetuned neural network is also part of the decoder, the encoder signals the weight update of the finetuned model to the decoder along with the content bitstream. At the decoding stage, the decoder first updates its neural network model according to the received weight update, and then proceeds with decoding the content bitstream. Since a neural network contains a large number of parameters, compressing the weight update is critical to reducing bitrate overhead. In this paper, we propose learning-based methods to find the parameters that are most important to overfit in terms of rate-distortion performance. Based on simple distribution models for the variables in the weight update, we derive two objective functions. By optimizing the proposed objective functions, the importance scores of the parameters can be calculated and the important parameters can be determined. Our experiments on a lossless image compression codec show that the proposed method significantly outperforms a prior-art method in which the overfitted parameters were selected based on heuristics. Furthermore, our technique improved the compression performance of the state-of-the-art lossless image compression codec by 0.1 bit per pixel.
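The signaling flow described in the abstract (the encoder sends a weight update, the decoder applies it before decoding) can be illustrated with a minimal sketch. This is not the paper's method: the importance scores here are simply given as input, whereas the paper learns them by optimizing its two objective functions. The function names `select_update` and `apply_update`, the top-`k` selection rule, and the uniform quantization step `step` are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def select_update(base: List[float], tuned: List[float],
                  scores: List[float], k: int,
                  step: float = 0.01) -> List[Tuple[int, int]]:
    """Encoder side: keep the k highest-scoring parameters and quantize their deltas.

    `scores`, `k`, and `step` are illustrative assumptions; the paper derives
    importance scores from its own objective functions.
    """
    # Rank parameter indices by importance score, highest first.
    ranked = sorted(range(len(base)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:k])
    # Uniformly quantize each kept delta (tuned - base) to an integer symbol.
    return [(i, round((tuned[i] - base[i]) / step)) for i in kept]


def apply_update(base: List[float], update: List[Tuple[int, int]],
                 step: float = 0.01) -> List[float]:
    """Decoder side: dequantize the received deltas and add them to the base weights."""
    weights = list(base)
    for i, q in update:
        weights[i] += q * step
    return weights


# Toy example: four parameters, only the two most important are signaled.
base = [0.10, -0.20, 0.30, 0.00]
tuned = [0.15, -0.20, 0.10, 0.02]
scores = [0.9, 0.0, 0.8, 0.1]  # hypothetical importance scores

update = select_update(base, tuned, scores, k=2)
decoded = apply_update(base, update)
print(update)
print(decoded)
```

Signaling only the selected indices and their quantized deltas keeps the overhead proportional to `k` rather than to the full parameter count, which is the trade-off the abstract describes: parameters 1 and 3 stay at their base values on the decoder side, at some cost in how closely the decoder matches the finetuned model.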


