Temporal signal energy correction and low-complexity encoder feedback for lossy scalable video coding
Marijn J. H. Loomans, Cornelis J. Koeleman, P. D. With
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702518 | 28th Picture Coding Symposium
In this paper, we address two problems found in embedded implementations of Scalable Video Codecs (SVCs): the temporal signal energy distribution and frame-to-frame quality fluctuations. The unequal energy distribution between the low- and high-pass bands with integer-based wavelets leads to sub-optimal rate-distortion choices coupled with quantization-error accumulation. The second problem is the quality fluctuation between frames within a Group Of Pictures (GOP). To solve these two problems, we present two modifications to the SVC. The first modification applies a temporal energy correction to the lifting scheme in the temporal wavelet decomposition. By moving this energy correction to the leaves of the temporal tree, we save on memory size, bandwidth and computations, while reducing floating/fixed-point conversion errors. The second modification feeds back the decoded first frame of the GOP (the temporal low-pass) into the temporal coding chain. The decoding of the first frame is achieved without entropy decoding and without requiring any modifications at the decoder. Experiments show that quality fluctuations within the GOP are significantly reduced, thereby significantly increasing subjective visual quality. On top of this, a small quality improvement is achieved on average.
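The deferred energy correction can be illustrated with a toy 1-D Haar temporal lifting: decompose without any per-level normalization and apply each band's accumulated sqrt(2) gain once, at the leaves. This is a minimal sketch under assumed simplifications (plain Haar lifting, no motion compensation, hypothetical function names); the paper's codec operates on motion-compensated frames:

```python
import math

def haar_lift(frames):
    # Unnormalized, integer-friendly Haar lifting: predict then update.
    # (A fixed-point implementation would replace h / 2 by a shift.)
    low, high = [], []
    for even, odd in zip(frames[0::2], frames[1::2]):
        h = odd - even          # predict step: high-pass sample
        l = even + h / 2        # update step: low-pass sample (pair mean)
        low.append(l)
        high.append(h)
    return low, high

def temporal_decompose(frames, levels):
    # Dyadic temporal decomposition with NO per-level scaling; instead,
    # each band carries a deferred energy-correction gain applied once,
    # at the leaves of the temporal tree.
    bands = {}
    low = frames
    for lvl in range(1, levels + 1):
        low, high = haar_lift(low)
        # A high band created at level lvl missed (lvl - 1) low-pass
        # gains of sqrt(2) and one high-pass gain of 1/sqrt(2).
        bands['H%d' % lvl] = (high, math.sqrt(2) ** (lvl - 2))
    bands['L%d' % levels] = (low, math.sqrt(2) ** levels)
    return bands
```

With the leaf gains applied, the corrected coefficients preserve the input signal energy, which is what makes rate-distortion comparisons between bands fair.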
Fast rate-distortion optimized transform for Intra coding
Xin Zhao, Li Zhang, Siwei Ma, Wen Gao
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702552 | 28th Picture Coding Symposium
In our previous work, the rate-distortion optimized transform (RDOT) was introduced for Intra coding; it is characterized by the use of multiple offline-trained transform matrix candidates. The proposed RDOT achieves remarkable coding gain for KTA Intra coding while maintaining almost the same computational complexity at the decoder. At the encoder, however, the computational complexity increases drastically due to the expensive rate-distortion (R-D) optimized selection of the transform matrix. To resolve this problem, we propose a fast RDOT scheme using macroblock- and block-level R-D cost thresholding. With the proposed method, unnecessary mode trials and R-D evaluations of transform matrices are efficiently skipped during the mode decision process. Extensive experimental results show that, with negligible coding performance degradation, about 88.9% of the total encoding time is saved by the proposed method.
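The thresholding idea can be sketched as an early-termination loop over transform candidates: once the running best Lagrangian cost undercuts the cost of the default transform by a margin, the remaining candidates are skipped. The skip rule, ratio, and names below are hypothetical stand-ins for the paper's macroblock- and block-level thresholds:

```python
def rd_cost(dist, rate, lam):
    # Standard Lagrangian rate-distortion cost J = D + lambda * R.
    return dist + lam * rate

def choose_transform(candidates, evaluate, lam, skip_ratio=1.05):
    # Try candidates in order; the first candidate plays the role of the
    # default transform. Once the running best cost beats the default
    # cost by more than the skip margin, further R-D evaluations are
    # skipped (assumed rule, standing in for the paper's thresholds).
    best, best_cost, first_cost = None, float('inf'), None
    for cand in candidates:
        dist, rate = evaluate(cand)   # evaluate() returns (D, R)
        cost = rd_cost(dist, rate, lam)
        if cost < best_cost:
            best, best_cost = cand, cost
        if first_cost is None:
            first_cost = cost
        elif best_cost * skip_ratio < first_cost:
            break                      # early termination
    return best, best_cost
```

The design trade-off is visible in the sketch: skipped candidates can occasionally contain the true optimum, which is why the paper reports a (negligible) coding performance degradation alongside the encoding-time saving.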
Suppressing texture-depth misalignment for boundary noise removal in view synthesis
Yin Zhao, Zhenzhong Chen, Dong Tian, Ce Zhu, Lu Yu
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702494 | 28th Picture Coding Symposium
During view synthesis based on depth maps, also known as Depth-Image-Based Rendering (DIBR), annoying artifacts are often generated around foreground objects, producing the visual effect that thin silhouettes of foreground objects are scattered into the background. These artifacts are referred to as boundary noises. We investigate their cause and find that they result from the misalignment between texture and depth information along object boundaries. Accordingly, we propose a novel solution that removes such boundary noises by applying restrictions, during forward warping, to the pixels within the texture-depth misalignment regions. Experiments show that this algorithm effectively eliminates most boundary noises and is robust for view synthesis with compressed depth and texture information.
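A 1-D toy version of the idea: flag pixels where a strong depth edge has no accompanying texture edge (the misalignment region), then withhold those pixels from the forward warp so foreground colour cannot leak into the background. The edge criterion, thresholds, and the linear depth-to-disparity mapping below are all assumed for illustration:

```python
def detect_misalignment(texture, depth, tex_thresh=30, dep_thresh=5):
    # Flag pixels where a depth edge is not matched by a texture edge
    # (illustrative 1-D criterion; both thresholds are assumed values).
    flags = [False] * len(texture)
    for i in range(1, len(texture)):
        depth_edge = abs(depth[i] - depth[i - 1]) > dep_thresh
        tex_edge = abs(texture[i] - texture[i - 1]) > tex_thresh
        if depth_edge and not tex_edge:
            flags[i] = True
    return flags

def forward_warp(texture, depth, baseline=0.1, misaligned=None):
    # 1-D forward warp; withheld (misaligned) pixels cannot scatter a
    # foreground silhouette into the background of the synthesized view.
    out = [None] * len(texture)
    for i, (t, d) in enumerate(zip(texture, depth)):
        if misaligned and misaligned[i]:
            continue
        j = i + int(round(baseline * d))  # disparity proportional to depth
        if 0 <= j < len(out):
            out[j] = t
    return out
```

In the toy example below, pixel 5 has foreground depth but background texture (a misaligned pixel); withholding it leaves a small hole to be inpainted instead of a scattered silhouette sample.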
Colorization-based coding by focusing on characteristics of colorization bases
Shunsuke Ono, T. Miyata, Y. Sakai
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702473 | 28th Picture Coding Symposium
Colorization is a method that adds color components to a grayscale image using only a few representative pixels provided by the user. A novel approach to image compression called colorization-based coding has recently been proposed. It automatically extracts representative pixels from an original color image at the encoder and restores a full color image by colorization at the decoder. However, previous studies on colorization-based coding extract redundant representative pixels and fail to extract the pixels required for suppressing coding error. This paper focuses on the colorization basis, which restricts the decoded color components. From this viewpoint, we propose a new colorization-based coding method. Experimental results reveal that our method can drastically reduce the amount of information (the number of representative pixels) compared with conventional colorization-based coding while maintaining objective quality.
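The encoder-decoder loop of colorization-based coding can be caricatured in 1-D: a toy decoder gives each pixel the chroma of the positionally nearest representative, and a greedy toy encoder adds, one at a time, the representative that most reduces that decoder's reconstruction error. This is a deliberately simplified stand-in (real colorization propagates chroma via a luminance-guided optimization, and the paper's contribution concerns the colorization basis itself); all names here are hypothetical:

```python
def colorize_nn(length, reps):
    # Toy decoder: each pixel takes the chroma of the positionally
    # nearest representative in `reps` ({index: chroma}).
    return [reps[min(reps, key=lambda r: abs(r - i))] for i in range(length)]

def pick_reps(chroma, budget):
    # Greedy toy encoder: repeatedly add the representative pixel that
    # most reduces the decoder-side squared reconstruction error.
    reps = {}
    for _ in range(budget):
        best = None
        for i in range(len(chroma)):
            if i in reps:
                continue
            trial = dict(reps)
            trial[i] = chroma[i]
            err = sum((a - b) ** 2
                      for a, b in zip(colorize_nn(len(chroma), trial), chroma))
            if best is None or err < best[0]:
                best = (err, i)
        reps[best[1]] = chroma[best[1]]
    return reps
```

Even this caricature shows the coding question the paper sharpens: which few pixels let the decoder's colorization basis span the original color components.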
Entropy coding in video compression using probability interval partitioning
D. Marpe, H. Schwarz, T. Wiegand
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702580 | 28th Picture Coding Symposium
We present a novel approach to entropy coding, which provides the coding efficiency and simple probability modeling capability of arithmetic coding at the complexity level of Huffman coding. The key element of the proposed approach is a partitioning of the unit interval into a small set of probability intervals. An input sequence of discrete source symbols is mapped to a sequence of binary symbols and each of the binary symbols is assigned to one of the probability intervals. The binary symbols that are assigned to a particular probability interval are coded at a fixed probability using a simple code that maps a variable number of binary symbols to variable length codewords. The probability modeling is decoupled from the actual binary entropy coding. The coding efficiency of the probability interval partitioning entropy (PIPE) coding is comparable to that of arithmetic coding.
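The routing step of PIPE coding — decoupling probability modeling from binary coding — can be sketched as follows: each binary symbol arrives with an estimated less-probable-bin probability, is assigned to the interval with the closest representative, and the per-interval bin sequences are then coded independently at fixed probabilities. The four representative probabilities below are assumed for illustration; the paper derives the interval partitioning and the variable-to-variable length codes:

```python
# Representative LPB probabilities for K = 4 intervals covering (0, 0.5]
# (values assumed for illustration only).
INTERVALS = [0.05, 0.15, 0.30, 0.45]

def assign_interval(p_lpb):
    # Route a binary symbol to the probability interval whose
    # representative is closest to its estimated LPB probability.
    return min(range(len(INTERVALS)), key=lambda k: abs(INTERVALS[k] - p_lpb))

def partition(bins_with_probs):
    # Group bins per interval; each group would then be coded with a
    # simple fixed-probability code mapping a variable number of bins
    # to variable length codewords.
    groups = [[] for _ in INTERVALS]
    for b, p in bins_with_probs:
        groups[assign_interval(p)].append(b)
    return groups
```

Because each interval's coder runs at one fixed probability, the adaptive probability estimation stays outside the entropy coders, which is what keeps the complexity near that of Huffman coding.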
Parallel processing method for realtime FTV
Kazuma Suzuki, Norishige Fukushima, T. Yendo, M. P. Tehrani, T. Fujii, M. Tanimoto
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702500 | 28th Picture Coding Symposium
In this paper, we propose a parallel processing method to generate free viewpoint images in real time. Expressing a free viewpoint image ideally requires capturing the scene from innumerable cameras, but arranging cameras at such high density is impractical. Therefore, images at arbitrary viewpoints must be interpolated from a limited set of captured images. This interpolation, however, involves a trade-off between image quality and computing time. The proposed method generates high-quality free viewpoint images in real time by applying parallel processing to the time-consuming interpolation part.
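The parallelization opportunity comes from the fact that view interpolation is independent per scanline (or per block). A minimal sketch, with a placeholder linear blend standing in for the actual interpolation kernel (the blend weights and function names are assumptions, not the paper's method):

```python
from concurrent.futures import ThreadPoolExecutor

def interpolate_row(args):
    # Blend one scanline of the virtual view from the two nearest
    # reference views; this placeholder blend stands in for the
    # time-consuming interpolation the paper parallelizes.
    left_row, right_row, alpha = args
    return [(1 - alpha) * l + alpha * r for l, r in zip(left_row, right_row)]

def render_virtual_view(left, right, alpha, workers=4):
    # Scanlines are independent, so the interpolation step is
    # embarrassingly parallel across rows.
    tasks = [(lr, rr, alpha) for lr, rr in zip(left, right)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(interpolate_row, tasks))
```

In practice a GPU or SIMD implementation would exploit the same independence at pixel granularity; the thread pool here only illustrates the decomposition.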
Stereoscopic depth estimation using fuzzy segment matching
K. Wegner, O. Stankiewicz, M. Domański
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702443 | 28th Picture Coding Symposium
Stereo matching techniques usually match segments or blocks of pixels. This paper proposes matching segments defined as fuzzy sets of pixels. The proposed matching method is applicable to various stereo matching techniques as well as to different measures of differences between pixels. The paper describes the embedding of this approach into state-of-the-art depth estimation software. Experimental results show that the proposed way of stereo matching increases the reliability of various depth estimation techniques.
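A fuzzy segment can be modeled by giving each pixel a membership degree instead of a hard in/out decision, and weighting the matching cost by that degree. A minimal 1-D sketch, with a Gaussian colour-similarity membership assumed for illustration (the paper's method is independent of the particular membership and difference measure):

```python
import math

def gaussian_membership(center_val, pixel_val, sigma=10.0):
    # Fuzzy degree of a pixel belonging to the segment around a centre
    # pixel, based on colour similarity (illustrative choice of kernel).
    return math.exp(-((center_val - pixel_val) ** 2) / (2 * sigma ** 2))

def fuzzy_cost(left, right, center, window, d, sigma=10.0):
    # Aggregate |L(i) - R(i - d)| over the window, weighting each pixel
    # by its fuzzy membership in the segment around `center`, so pixels
    # unlike the centre contribute little to the match.
    num = den = 0.0
    for i in window:
        j = i - d
        if not (0 <= j < len(right)):
            continue
        w = gaussian_membership(left[center], left[i], sigma)
        num += w * abs(left[i] - right[j])
        den += w
    return num / den if den else float('inf')
```

Compared with hard segments, the soft weights keep pixels near object boundaries from being fully committed to the wrong segment, which is one plausible source of the reported reliability gain.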
Technical design & IPR analysis for royalty-free video codecs
C. Reader
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702433 | 28th Picture Coding Symposium
Royalty-free standards for image and video coding have been actively discussed for over 20 years. This paper breaks down the issues of designing royalty-free codecs into the major topics of requirements, video coding tools, classes of patents and performance. By dissecting the codec using a hierarchy of major to minor coding tools, it is possible to pinpoint where a patent impacts the video coding, and what the consequence will be of avoiding the patented tool.
Free-viewpoint image generation using different focal length camera array
Kengo Ando, Norishige Fukushima, T. Yendo, M. P. Tehrani, T. Fujii, M. Tanimoto
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702508 | 28th Picture Coding Symposium
The availability of multi-view images of a scene makes new and exciting applications possible, including Free-Viewpoint TV (FTV). FTV allows us to change the viewpoint freely in a 3D world, where the virtual viewpoint images are synthesized by Image-Based Rendering (IBR). In this paper, we introduce an FTV depth estimation method for forward virtual viewpoints. Moreover, we introduce a view generation method that uses a zoom camera in our camera setup to improve the virtual viewpoints' image quality. Simulation results confirm reduced depth estimation error with our proposed method in comparison with a conventional stereo matching scheme. We also demonstrate the improvement in image resolution of a virtually forward-moved camera using the zoom camera setup.
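The reason a zoom camera helps can be seen from pinhole geometry: moving the virtual camera forward by dz magnifies content at distance z by z / (z - dz), so the forward view needs more resolution than the reference cameras captured. A one-line sketch of that magnification factor (function name assumed; the paper's rendering pipeline is of course more involved):

```python
def forward_zoom_factor(depth, dz):
    # Pinhole-model magnification of scene content at distance `depth`
    # when the virtual camera moves forward by `dz`: depth / (depth - dz).
    # A zoom camera in the array can supply this extra resolution natively.
    if depth <= dz:
        raise ValueError("virtual camera would pass the scene point")
    return depth / (depth - dz)
```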
Bit-plane compressive sensing with Bayesian decoding for lossy compression
Sz-Hsien Wu, Wen-Hsiao Peng, Tihao Chiang
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702577 | 28th Picture Coding Symposium
This paper addresses the problem of reconstructing a compressively sampled sparse signal from its lossy and possibly insufficient measurements. The process involves estimating the sparsity pattern and the sparse representation, for which we derive a vector estimator based on the Maximum a Posteriori probability (MAP) rule. By making full use of prior knowledge about the signal, our scheme can achieve perfect reconstruction with a number of measurements close to the sparsity. It also shows a much lower error probability for the sparsity pattern than prior work, given insufficient measurements. To better recover the most significant part of the sparse representation, we further introduce the notion of bit-plane separation. When applied to image compression, the technique combined with our MAP estimator shows promising results compared to JPEG: the difference in compression ratio is within a factor of two at the same decoded quality.
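The two-stage structure — estimate the sparsity pattern, then the sparse representation — can be illustrated in the smallest possible setting: a 1-sparse signal with a flat prior, where the MAP rule reduces to picking the support whose least-squares fit leaves the smallest residual. This is a tiny stand-in for the paper's vector estimator (exhaustive search is only feasible because the toy signal is 1-sparse):

```python
def recover_1sparse(A, y):
    # For each candidate support {j}, the best coefficient is the
    # least-squares projection of y onto column j; under a flat prior
    # the MAP choice is the support with the minimum residual.
    m, n = len(A), len(A[0])
    best = (float('inf'), None, 0.0)   # (residual, index, coefficient)
    for j in range(n):
        col = [A[i][j] for i in range(m)]
        cc = sum(c * c for c in col)
        if cc == 0:
            continue
        coeff = sum(col[i] * y[i] for i in range(m)) / cc
        resid = sum((y[i] - coeff * col[i]) ** 2 for i in range(m))
        if resid < best[0]:
            best = (resid, j, coeff)
    return best[1], best[2]
```

With noiseless measurements of a truly 1-sparse signal, the correct support has zero residual, so two measurements (close to the sparsity) already suffice here; the paper's estimator handles general sparsity and lossy, quantized measurements.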