Jianping Lin, Mohammad Akbari, H. Fu, Qian Zhang, Shang Wang, Jie Liang, Dong Liu, F. Liang, Guohe Zhang, Chengjie Tu
2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP), published 2020-09-21. DOI: 10.1109/MMSP48831.2020.9287082 (https://doi.org/10.1109/MMSP48831.2020.9287082)
Variable-Rate Multi-Frequency Image Compression using Modulated Generalized Octave Convolution
In this proposal, we design a learned multi-frequency image compression approach that uses generalized octave convolutions to factorize the latent representations into high-frequency (HF) and low-frequency (LF) components. The LF components have lower resolution than the HF components, which can improve the rate-distortion performance, similar to the wavelet transform. Moreover, compared to the original octave convolution, the proposed generalized octave convolution (GoConv) and octave transposed-convolution (GoTConv) with internal activation layers preserve more of the spatial structure of the information and enable more effective filtering between the HF and LF components, which further improves the performance. In addition, we develop a variable-rate scheme that uses the Lagrangian parameter to modulate all the internal feature maps in the autoencoder, which allows the scheme to cover the large bitrate range of JPEG AI with only three models. Experiments show that the proposed scheme achieves much better Y MS-SSIM than VVC. In terms of YUV PSNR, our scheme performs very similarly to HEVC.
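To make the two ideas in the abstract concrete, the following is a minimal sketch and not the proposed network. It assumes fixed 2x2 average pooling and nearest-neighbour upsampling as stand-ins for the learned GoConv and GoTConv layers, and a single scalar gain as a stand-in for the modulation driven by the Lagrangian parameter. All function names (`split_hf_lf`, `merge_hf_lf`, `modulate`, and so on) are hypothetical. The sketch shows an LF component stored at half the resolution of the HF component, and how scaling a feature map before rounding changes the number of distinct symbols that would have to be coded.

```python
# Illustrative sketch only: fixed filters stand in for the learned
# GoConv / GoTConv layers, and a scalar gain stands in for the
# Lagrangian-driven modulation. None of this is the authors' code.

def avg_pool2(x):
    """Halve the resolution with 2x2 average pooling (stand-in for a strided convolution)."""
    return [[(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4.0
             for j in range(0, len(x[0]), 2)]
            for i in range(0, len(x), 2)]

def upsample2(x):
    """Double the resolution by repeating samples (stand-in for a transposed convolution)."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.extend([wide, list(wide)])
    return out

def split_hf_lf(x):
    """Factorize x into a full-resolution HF residual and a half-resolution LF part."""
    lf = avg_pool2(x)
    up = upsample2(lf)
    hf = [[a - b for a, b in zip(rx, ru)] for rx, ru in zip(x, up)]
    return hf, lf

def merge_hf_lf(hf, lf):
    """Invert split_hf_lf by adding the upsampled LF part back to the HF residual."""
    up = upsample2(lf)
    return [[a + b for a, b in zip(rh, ru)] for rh, ru in zip(hf, up)]

def modulate(fmap, gain):
    """Scale every entry of a feature map by a rate-dependent gain (assumed scalar here)."""
    return [[v * gain for v in row] for row in fmap]

def quantize(fmap):
    """Round every entry to the nearest integer symbol."""
    return [[round(v) for v in row] for row in fmap]

def distinct_symbols(fmap):
    """Count distinct quantized symbols, a crude proxy for the coding cost."""
    return len({v for row in fmap for v in row})

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

hf, lf = split_hf_lf(image)
reconstructed = merge_hf_lf(hf, lf)

# A small gain collapses the HF residual onto few symbols (low rate);
# a large gain keeps more detail (higher rate).
coarse = quantize(modulate(hf, 0.1))
fine = quantize(modulate(hf, 2.0))

print("LF shape:", len(lf), "x", len(lf[0]))
print("HF shape:", len(hf), "x", len(hf[0]))
print("distinct HF symbols at low / high gain:",
      distinct_symbols(coarse), "/", distinct_symbols(fine))
```

In this toy setting the split is exactly invertible before quantization, and a single set of filters serves several rates because only the gain changes. In the proposed scheme the filters are learned and the modulation acts on all internal feature maps of the autoencoder, so the sketch should be read only as an analogy.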