Signal and Loss Geometry Aware Frequency Selective Extrapolation for Error Concealment
Pub Date : 2018-06-01 | DOI: 10.1109/PCS.2018.8456259
Nils Genser, Jürgen Seiler, Franz Schilling, André Kaup
The concealment of errors is an important task in image and video signal processing. Often, complex models are calculated to reconstruct the missing samples, which results in long computation times. One method that achieves very high reconstruction quality while requiring only moderate computational complexity is block-based Frequency Selective Extrapolation. Nevertheless, the reconstruction of a Full HD image can still take several minutes, depending on the error pattern. To accelerate the computation, a novel algorithm is introduced in this paper that analyzes the adjacent, undistorted samples and optimizes the reconstruction parameters accordingly. Moreover, this analysis is further used to adapt the partitioning of the blocks and the processing order. Similar to modern video codecs such as High Efficiency Video Coding, a content-based partitioning and processing is proposed, as it takes the signal characteristics into account. Thus, the novel algorithm is on average four times faster than the state-of-the-art method and up to 25× quicker at best, while also achieving slightly higher reconstruction quality.
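For illustration only, a minimal NumPy sketch of the content-adaptive partitioning idea described in the abstract: the variance of the undistorted neighbouring samples serves as a stand-in activity measure that decides whether a block is extrapolated as a whole or split further, and how much effort is spent on it. All function names, thresholds and iteration counts are assumptions; the actual Frequency Selective Extrapolation model generation is not reproduced.

```python
import numpy as np

def neighborhood_activity(block, mask):
    """Variance of the undistorted (mask == True) samples of a block.

    Hypothetical stand-in for the signal analysis step described in the paper.
    """
    valid = block[mask]
    if valid.size < 2:
        return 0.0
    return float(np.var(valid))

def plan_block(block, mask, split_threshold=500.0, min_size=8):
    """Decide recursively whether to extrapolate a block as a whole or split it,
    loosely mimicking a content-based quad-tree partitioning."""
    size = block.shape[0]
    activity = neighborhood_activity(block, mask)
    if size <= min_size or activity < split_threshold:
        # Low activity: one large block and few model-generation iterations suffice.
        iterations = 20 if activity < split_threshold else 100
        return [{"size": size, "iterations": iterations}]
    # High activity: split into four sub-blocks and plan each one separately.
    h = size // 2
    plans = []
    for r in (slice(0, h), slice(h, size)):
        for c in (slice(0, h), slice(h, size)):
            plans += plan_block(block[r, c], mask[r, c], split_threshold, min_size)
    return plans

# Toy usage: a 32x32 block with a missing 8x8 region in the centre.
block = np.random.randint(0, 256, (32, 32)).astype(float)
mask = np.ones((32, 32), dtype=bool)
mask[12:20, 12:20] = False          # lost samples
print(plan_block(block, mask))
```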
{"title":"Signal and Loss Geometry Aware Frequency Selective Extrapolation for Error Concealment","authors":"Nils Genser, Jürgen Seiler, Franz Schilling, André Kaup","doi":"10.1109/PCS.2018.8456259","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456259","url":null,"abstract":"The concealment of errors is an important task in image and video signal processing. Often, complex models are calculated to reconstruct the missing samples, which results in a long computation time. One method that achieves a very high reconstruction quality, but demands a moderate computational complexity only, is the block based Frequency Selective Extrapolation. Nevertheless, the reconstruction of a Full HD image can still take several minutes depending on the error pattern. To accelerate the computation, a novel algorithm is introduced in this paper that analyzes the adjacent, undistorted samples and optimizes the reconstruction parameters accordingly. Moreover, the analyzation is further used to adapt the partitioning of the blocks and the processing order. Similar to modern video codecs, e.g., High Efficiency Video Coding, a content based partitioning and processing is proposed as it takes the signal characteristics into account. Thus, the novel algorithm is on average four times faster than the state-of-the-art method and up to $25times $ quicker at best, while achieving a slightly higher reconstruction quality as well.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127569815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CNN-based Prediction for Lossless Coding of Photographic Images
Pub Date : 2018-06-01 | DOI: 10.1109/PCS.2018.8456311
I. Schiopu, Yu Liu, A. Munteanu
The paper proposes a novel prediction paradigm in image coding based on Convolutional Neural Networks (CNN). A deep neural network is designed to provide accurate pixel-wise prediction based on a causal neighbourhood. The proposed CNN prediction method is trained on the high-activity areas in the image and is incorporated into a lossless compression system for high-resolution photographic images. The system uses the proposed CNN-based prediction paradigm as well as LOCO-I, whereby the predictor selection is performed using a local entropy-based descriptor. The prediction errors are encoded using a CALIC-based reference codec. The experimental results show good performance for the proposed prediction scheme compared to state-of-the-art predictors. To our knowledge, the paper is the first to introduce CNN-based prediction in image coding, and it demonstrates the potential offered by machine learning methods in coding applications.
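This is not the authors' network, but a minimal PyTorch sketch of the general idea of pixel-wise prediction from a causal neighbourhood: a small CNN maps a patch in which only the pixels above and to the left of the current position are visible to an estimate of the current pixel. Layer widths and the patch size are assumptions.

```python
import torch
import torch.nn as nn

class CausalPixelPredictor(nn.Module):
    """Predicts one pixel value from a KxK causal neighbourhood patch.

    The current pixel and everything below/right of it is assumed to be
    masked out (e.g. set to zero) before the patch is passed in.
    """
    def __init__(self, patch_size: int = 8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * patch_size * patch_size, 64), nn.ReLU(),
            nn.Linear(64, 1),          # predicted value of the current pixel
        )

    def forward(self, patch):
        return self.head(self.features(patch))

# Toy usage: a batch of 4 single-channel causal patches of size 8x8.
model = CausalPixelPredictor(patch_size=8)
patches = torch.rand(4, 1, 8, 8)
prediction = model(patches)                # shape (4, 1)
residual = torch.rand(4, 1) - prediction   # residual would go to the entropy coder
```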
{"title":"CNN-based Prediction for Lossless Coding of Photographic Images","authors":"I. Schiopu, Yu Liu, A. Munteanu","doi":"10.1109/PCS.2018.8456311","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456311","url":null,"abstract":"The paper proposes a novel prediction paradigm in image coding based on Convolutional Neural Networks (CNN). A deep neural network is designed to provide accurate pixel-wise prediction based on a causal neighbourhood. The proposed CNN prediction method is trained on the high-activity areas in the image and it is incorporated in a lossless compression system for high-resolution photographic images. The system uses the proposed CNN-based prediction paradigm as well as LOCO-I, whereby the predictor selection is performed using a local entropy-based descriptor. The prediction errors are encoded using a CALIC-based reference codec. The experimental results show a good performance for the proposed prediction scheme compared to state-of-the-art predictors. To our knowledge, the paper is the first to introduce CNN-based prediction in image coding, and demonstrates the potential offered by machine learning methods in coding applications.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128440750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep Convolutional AutoEncoder-based Lossy Image Compression
Pub Date : 2018-04-25 | DOI: 10.1109/PCS.2018.8456308
Zhengxue Cheng, Heming Sun, Masaru Takeuchi, J. Katto
Image compression has been investigated as a fundamental research topic for many decades. Recently, deep learning has achieved great success in many computer vision tasks and is gradually being used in image compression. In this paper, we present a lossy image compression architecture that exploits the advantages of the convolutional autoencoder (CAE) to achieve high coding efficiency. First, we design a novel CAE architecture to replace the conventional transforms and train this CAE using a rate-distortion loss function. Second, to generate a more energy-compact representation, we use principal component analysis (PCA) to rotate the feature maps produced by the CAE, and then apply quantization and an entropy coder to generate the codes. Experimental results demonstrate that our method outperforms traditional image coding algorithms, achieving a 13.7% BD-rate reduction on the Kodak database images compared to JPEG2000. Moreover, our method maintains a moderate complexity, similar to JPEG2000.
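A rough NumPy sketch of the PCA step described above, under the assumption that the CAE feature maps are flattened into one vector per image: the vectors are rotated onto their principal components and uniformly quantized. The CAE itself and the entropy coder are omitted, and all names and the quantization step are illustrative.

```python
import numpy as np

def pca_rotate_and_quantize(features, step=0.1):
    """Rotate flattened CAE feature vectors with PCA and uniformly quantize them.

    `features` is an (N, D) array of flattened feature maps, one row per image.
    Here the PCA basis is computed from the data itself; in practice it would be
    learned offline and shared with the decoder.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Principal directions via SVD of the centered data matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    rotated = centered @ vt.T                              # energy-compact representation
    symbols = np.round(rotated / step).astype(np.int32)    # uniform quantization
    return symbols, vt, mean

def dequantize_and_unrotate(symbols, vt, mean, step=0.1):
    """Inverse of the above: de-quantize and rotate back to the feature domain."""
    rotated = symbols.astype(np.float64) * step
    return rotated @ vt + mean

# Toy usage with random stand-in "feature maps" of dimension 256 for 32 images.
feats = np.random.randn(32, 256)
symbols, basis, mean = pca_rotate_and_quantize(feats)
reconstructed = dequantize_and_unrotate(symbols, basis, mean)
print(np.abs(feats - reconstructed).max())   # small error, dominated by quantization
```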
{"title":"Deep Convolutional AutoEncoder-based Lossy Image Compression","authors":"Zhengxue Cheng, Heming Sun, Masaru Takeuchi, J. Katto","doi":"10.1109/PCS.2018.8456308","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456308","url":null,"abstract":"Image compression has been investigated as a fundamental research topic for many decades. Recently, deep learning has achieved great success in many computer vision tasks, and is gradually being used in image compression. In this paper, we present a lossy image compression architecture, which utilizes the advantages of convolutional autoencoder (CAE) to achieve a high coding efficiency. First, we design a novel CAE architecture to replace the conventional transforms and train this CAE using a rate-distortion loss function. Second, to generate a more energy-compact representation, we utilize the principal components analysis (PCA) to rotate the feature maps produced by the CAE, and then apply the quantization and entropy coder to generate the codes. Experimental results demonstrate that our method outperforms traditional image coding algorithms, by achieving a 13.7% BD-rate decrement on the Kodak database images compared to JPEG2000. Besides, our method maintains a moderate complexity similar to JPEG2000.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123588538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient Nonlinear Transforms for Lossy Image Compression
Pub Date : 2018-01-31 | DOI: 10.1109/PCS.2018.8456272
J. Ballé
We assess the performance of two techniques in the context of nonlinear transform coding with artificial neural networks, Sadam and GDN. Both techniques have been successfully used in state-of-the-art image compression methods, but their performance has not been individually assessed to this point. Together, the techniques stabilize the training procedure of nonlinear image transforms and increase their capacity to approximate the (unknown) rate-distortion optimal transform functions. Besides comparing their performance to established alternatives, we detail the implementation of both methods and provide open-source code along with the paper.
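For context, a small NumPy sketch of GDN in its commonly used simplified form, y_i = x_i / sqrt(beta_i + sum_j gamma_ij * x_j^2), applied to one vector of channel responses. The parameter values are placeholders, and Sadam, the other technique assessed in the paper, is not shown.

```python
import numpy as np

def gdn(x, beta, gamma):
    """Generalized divisive normalization in its common simplified form:

        y[i] = x[i] / sqrt(beta[i] + sum_j gamma[i, j] * x[j] ** 2)

    x     : (C,) vector of channel responses at one spatial position
    beta  : (C,) positive offsets
    gamma : (C, C) non-negative cross-channel weights
    """
    denom = np.sqrt(beta + gamma @ (x ** 2))
    return x / denom

# Toy usage with 4 channels and an identity-dominated gamma.
rng = np.random.default_rng(0)
C = 4
x = rng.normal(size=C)
beta = np.full(C, 1e-3)
gamma = 0.1 * np.ones((C, C)) + np.eye(C)
print(gdn(x, beta, gamma))
```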
{"title":"Efficient Nonlinear Transforms for Lossy Image Compression","authors":"J. Ballé","doi":"10.1109/PCS.2018.8456272","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456272","url":null,"abstract":"We assess the performance of two techniques in the context of nonlinear transform coding with artificial neural networks, Sadam and GDN. Both techniques have been success- fully used in state-of-the-art image compression methods, but their performance has not been individually assessed to this point. Together, the techniques stabilize the training procedure of nonlinear image transforms and increase their capacity to approximate the (unknown) rate-distortion optimal transform functions. Besides comparing their performance to established alternatives, we detail the implementation of both methods and provide open-source code along with the paper.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126228391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Saak Transform Approach to Efficient, Scalable and Robust Handwritten Digits Recognition
Pub Date : 2017-10-29 | DOI: 10.1109/PCS.2018.8456277
Yueru Chen, Zhuwei Xu, Shanshan Cai, Yujian Lang, C.-C. Jay Kuo
An efficient, scalable and robust approach to the handwritten digits recognition problem based on the Saak transform is proposed in this work. First, multi-stage Saak transforms are used to extract a family of joint spatial-spectral representations of the input images. Then, the Saak coefficients are used as features and fed into an SVM classifier for the classification task. In order to control the size of the Saak coefficients, we adopt a lossy Saak transform that uses principal component analysis (PCA) to select a smaller set of transform kernels. The handwritten digits recognition problem is well solved by convolutional neural networks (CNNs) such as LeNet-5. We conduct a comparative study on the performance of the LeNet-5 and the Saak-transform-based solutions in terms of scalability and robustness, as well as the efficiency of lossless and lossy Saak transforms at a comparable accuracy level.
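A loose scikit-learn sketch of the overall pipeline shape suggested above: PCA-derived patch kernels whose coefficients are fed to an SVM. It deliberately omits the multi-stage structure and the kernel augmentation that distinguish the actual Saak transform, and the toy data, patch size and number of retained kernels are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def patch_features(images, pca, patch=4):
    """Project non-overlapping patches onto the retained PCA kernels and
    concatenate the coefficients into one feature vector per image."""
    n, h, w = images.shape
    feats = []
    for img in images:
        blocks = [img[r:r + patch, c:c + patch].ravel()
                  for r in range(0, h, patch) for c in range(0, w, patch)]
        feats.append(pca.transform(np.array(blocks)).ravel())
    return np.array(feats)

# Toy data standing in for handwritten digits (e.g. 28x28 MNIST images).
rng = np.random.default_rng(0)
images = rng.random((200, 28, 28))
labels = rng.integers(0, 10, 200)

# Learn a reduced ("lossy") set of 4x4 patch kernels with PCA.
patches = images.reshape(-1, 7, 4, 7, 4).transpose(0, 1, 3, 2, 4).reshape(-1, 16)
pca = PCA(n_components=6).fit(patches)

X = patch_features(images, pca)
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.score(X, labels))
```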
{"title":"A Saak Transform Approach to Efficient, Scalable and Robust Handwritten Digits Recognition","authors":"Yueru Chen, Zhuwei Xu, Shanshan Cai, Yujian Lang, C.-C. Jay Kuo","doi":"10.1109/PCS.2018.8456277","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456277","url":null,"abstract":"An efficient, scalable and robust approach to the handwritten digits recognition problem based on the Saak transform is proposed in this work. First, multi-stage Saak transforms are used to extract a family of joint spatial-spectral representations of input images. Then, the Saak coefficients are used as features and fed into the SVM classifier for the classification task. In order to control the size of Saak coefficients, we adopt a lossy Saak transform that uses the principal component analysis (PCA) to select a smaller set of transform kernels. The handwritten digits recognition problem is well solved by the convolutional neural network (CNN) such as the LeNet-5. We conduct a comparative study on the performance of the LeNet-5 and the Saak-transform-based solutions in terms of scalability and robustness as well as the efficiency of lossless and lossy Saak transforms under a comparable accuracy level.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"220 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122521368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generative Compression
Pub Date : 2017-03-04 | DOI: 10.1109/PCS.2018.8456298
Shibani Santurkar, D. Budden, N. Shavit
Traditional image and video compression algorithms rely on hand-crafted encoder/decoder pairs (codecs) that lack adaptability and are agnostic to the data being compressed. We describe the concept of generative compression, the compression of data using generative models, and suggest that it is a direction worth pursuing to produce more accurate and visually pleasing reconstructions at deeper compression levels for both image and video data. We also show that generative compression is orders of magnitude more robust to bit errors (e.g., from noisy channels) than traditional variable-length coding schemes.
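To make the bit-error claim concrete, a toy sketch of what such a robustness test could look like: a latent code (which in generative compression would come from an encoder and be decoded by a generator) is quantized to 8-bit symbols, random bits are flipped to simulate a noisy channel, and the perturbed code is dequantized. The generator itself is omitted, and everything here, names included, is illustrative rather than taken from the paper.

```python
import numpy as np

def quantize_latent(z, bits=8):
    """Map a latent vector in [-1, 1] to unsigned integer symbols."""
    levels = 2 ** bits - 1
    return np.round((np.clip(z, -1, 1) + 1) / 2 * levels).astype(np.uint8)

def flip_bits(symbols, error_rate, rng):
    """Simulate a binary symmetric channel: flip each bit with probability error_rate."""
    bit_flips = rng.random((symbols.size, 8)) < error_rate
    flip_mask = np.packbits(bit_flips, axis=1).reshape(symbols.shape)
    return symbols ^ flip_mask

def dequantize_latent(symbols, bits=8):
    levels = 2 ** bits - 1
    return symbols.astype(np.float64) / levels * 2 - 1

rng = np.random.default_rng(0)
z = np.tanh(rng.normal(size=128))          # stand-in for an encoder output
noisy = dequantize_latent(flip_bits(quantize_latent(z), error_rate=0.01, rng=rng))
print(np.mean(np.abs(noisy - z)))          # distortion a generator would have to absorb
```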
{"title":"Generative Compression","authors":"Shibani Santurkar, D. Budden, N. Shavit","doi":"10.1109/PCS.2018.8456298","DOIUrl":"https://doi.org/10.1109/PCS.2018.8456298","url":null,"abstract":"Traditional image and video compression algorithms rely on hand-crafted encoder/decoder pairs (codecs) that lack adaptability and are agnostic to the data being compressed. We describe the concept of generative compression, the compression of data using generative models, and suggest that it is a direction worth pursuing to produce more accurate and visually pleasing reconstructions at deeper compression levels for both image and video data. We also show that generative compression is orders- of-magnitude more robust to bit errors (e.g., from noisy channels) than traditional variable-length coding schemes.","PeriodicalId":433667,"journal":{"name":"2018 Picture Coding Symposium (PCS)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117351791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}