Title: In-loop noise shaping based on pseudo noise injection and Wiener filtering
Authors: K. Chono, Y. Senda
Venue: 2011 IEEE 13th International Workshop on Multimedia Signal Processing
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093804
Abstract: This paper proposes an in-loop noise shaping method for High Efficiency Video Coding (HEVC) based on the combination of pseudo noise injection and Wiener filtering. In the deblocking process, the proposed method injects pseudo noise into the vicinities of deblocked edges, where signal-dependent coding noise is expected to appear. The pseudo noise injection masks signal-dependent noise with signal-independent noise. Since the subsequently applied Wiener filtering accomplishes optimal noise reduction in a minimum mean-squared-error sense, it minimizes the deleterious impact of the pseudo noise injection on coding performance. Simulation results using the HEVC Test Model software show that the proposed method successfully suppresses banding artifacts with a negligible impact on PSNR values, bit rates, and encoder/decoder runtimes.
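The second stage — removing injected signal-independent noise with a Wiener filter that is MMSE-optimal under a local-stationarity assumption — can be sketched as follows. This is an illustrative NumPy sketch, not the paper's in-loop implementation; the window size and noise level below are arbitrary choices.

```python
import numpy as np

def local_wiener(noisy, noise_var, win=5):
    """Locally adaptive Wiener filter: MMSE-optimal shrinkage toward the
    local mean, assuming locally stationary signal plus white noise."""
    pad = win // 2
    padded = np.pad(noisy, pad, mode="reflect")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (win, win))
    mu = windows.mean(axis=(-2, -1))   # local mean
    var = windows.var(axis=(-2, -1))   # local variance (signal + noise)
    gain = np.maximum(var - noise_var, 0.0) / np.maximum(var, 1e-12)
    return mu + gain * (noisy - mu)

def inject_and_filter(deblocked, noise_std, seed=0):
    """Inject signal-independent pseudo noise, then Wiener-filter it out."""
    rng = np.random.default_rng(seed)
    noisy = deblocked + rng.normal(0.0, noise_std, deblocked.shape)
    return local_wiener(noisy, noise_std ** 2)
```

Because the filter knows the injected noise variance, most of the added noise is removed again, which is why the injection costs little in PSNR.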
Title: ROI based video streaming for 3D remote rendering
Authors: N. Tizon, Christina Moreno, M. Preda
Venue: 2011 IEEE 13th International Workshop on Multimedia Signal Processing
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093837
Abstract: This paper proposes a low-complexity method for ROI (Region Of Interest) based video encoding and adaptive streaming in remote rendering applications. The main objective of the proposed solution is to minimize the latency of the interactive loop even under poor transmission conditions. To this end, the depth map provided by the rendering engine is exploited by the real-time video encoder to adapt the bitrate of the transmitted stream. In particular, thanks to an efficient coupling between the rendering and video encoding stages, the macroblocks of each video frame are encoded with different quantization steps that follow an ROI partitioning. The details of this partitioning algorithm are provided, along with some implementation considerations. The simulation results demonstrate the benefit of our adaptive approach from the user experience point of view.
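The depth-driven rate adaptation can be illustrated by mapping each macroblock's depth to a quantization parameter (QP): near objects (the likely ROI) get a low QP, distant background a higher one. The base QP, offset range, and linear mapping are illustrative assumptions, not the paper's actual partitioning rule.

```python
import numpy as np

def depth_to_qp(depth, qp_base=26, qp_range=8):
    """Map per-macroblock depth to an H.264-style QP in [0, 51].
    Smaller depth (closer to the viewer) -> lower QP -> higher quality."""
    d = depth.astype(float)
    span = float(np.ptp(d))
    norm = (d - d.min()) / span if span > 0 else np.zeros_like(d)
    return np.clip(np.round(qp_base + qp_range * norm), 0, 51).astype(int)
```

For example, a 2x2 macroblock depth map `[[1, 2], [3, 5]]` yields QPs 26 and 34 for the nearest and farthest blocks respectively.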
Title: A fast video stabilization algorithm based on block matching and edge completion
Authors: Chongwu Tang, Xiaokang Yang, Li Chen, Guangtao Zhai
Venue: 2011 IEEE 13th International Workshop on Multimedia Signal Processing
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093781
Abstract: The purpose of video stabilization is to register the frames of a video sequence, which exhibit relative motion between one another, to yield a stable video of higher perceptual quality. In this paper we focus on fast and robust video stabilization of a single scene based on temporal block matching. We use the VoD principle to find the local motion vectors between adjacent frames, and then use statistical analysis to generate the global vibration motion vector. After motion compensation, we further design an edge completion algorithm that incorporates mosaicking and inpainting from neighbouring frames, so as to reduce the impact of error propagation. Experimental results and comparative studies are provided to justify the effectiveness of the proposed algorithm.
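A minimal version of the two-step motion estimation — exhaustive SAD block matching for local vectors, then a robust statistic for the global vector — can be sketched as below. The per-component median is an assumption standing in for the paper's statistical analysis; it is robust to outlier blocks such as independently moving objects.

```python
import numpy as np

def block_motion_vectors(prev, cur, block=8, search=4):
    """Full-search SAD block matching; returns one (dy, dx) per block of cur."""
    H, W = cur.shape
    vecs = []
    for y in range(0, H - block + 1, block):
        for x in range(0, W - block + 1, block):
            tgt = cur[y:y + block, x:x + block].astype(int)
            best_sad, best_v = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    py, px = y + dy, x + dx
                    if py < 0 or px < 0 or py + block > H or px + block > W:
                        continue  # candidate window falls outside prev
                    ref = prev[py:py + block, px:px + block].astype(int)
                    sad = int(np.abs(ref - tgt).sum())
                    if best_sad is None or sad < best_sad:
                        best_sad, best_v = sad, (dy, dx)
            vecs.append(best_v)
    return vecs

def global_motion(vecs):
    """Per-component median of the local vectors: the dominant (camera) motion."""
    arr = np.array(vecs)
    return int(np.median(arr[:, 0])), int(np.median(arr[:, 1]))
```

Compensating each frame by the negated global vector then aligns the sequence; the uncovered border is what the edge completion step fills in.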
Title: Motion-adaptive quantization and reconstruction technique for distributed video coding
Authors: Aniruddha Shirahatti, Joohee Kim
Venue: 2011 IEEE 13th International Workshop on Multimedia Signal Processing
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093782
Abstract: Distributed video coding is a new paradigm based on two information-theoretic results by Slepian-Wolf and Wyner-Ziv. The architectures designed so far have invariably made use of uniform scalar quantization schemes, with only a few attempts to make the schemes more adaptive. Quantization is one of the major contributors to the large performance gap between conventional video coding standards and distributed video coding. In this paper, the performance of Wyner-Ziv video coding is improved by making the quantization algorithm more adaptive to the motion content of the video sequence without significantly increasing the encoder complexity. The proposed method also exploits temporal correlation to provide online correlation noise classification. Hence, the improved reconstruction technique, which uses the correlation noise information, is more adaptive to the motion content. Simulation results show that the proposed motion-adaptive quantization and reconstruction technique achieves improved rate-distortion performance.
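The reconstruction step in Wyner-Ziv coding is classically implemented as boundary clamping: if the side-information value falls inside the decoded quantization bin it is kept, otherwise the nearest bin boundary is taken. The sketch below shows that classical rule — the baseline that correlation-noise-adaptive reconstruction refines — not the paper's motion-adaptive variant itself.

```python
def wz_reconstruct(side_info, bin_index, step):
    """Classical Wyner-Ziv reconstruction: clamp the side-information value
    to the decoded quantization bin [bin_index*step, (bin_index+1)*step]."""
    lo = bin_index * step
    hi = (bin_index + 1) * step
    return min(max(side_info, lo), hi)
```

The intuition: inside the bin the side information is the best guess; outside it, the bin boundary nearest the side information minimizes the expected error.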
Title: Saliency-based visualization for image search
Authors: Jiajie Hu, Bin Jin, Weiyao Lin, Jun Huang, Hangzai Luo, Zhenzhong Chen, Hongxiang Li
Venue: 2011 IEEE 13th International Workshop on Multimedia Signal Processing
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093833
Abstract: In this paper, we propose a novel algorithm for improving and visualizing image search results. The proposed algorithm improves the user's image search experience in three steps: (1) re-rank the initial image search results by random walk refinement based on visual consistency and saliency cues, (2) project the re-ranked images onto a 2-dimensional panel according to their saliency information and correlations, and (3) detect and extract the salient regions in each image for final visualization. To evaluate the performance of our algorithm, a user study has been conducted. Experimental results demonstrate that our visualization algorithm provides a more pleasing image search experience than conventional image search methods.
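Step (1), random walk refinement over a visual-similarity graph, can be sketched as a personalized-PageRank-style iteration. The similarity matrix, damping factor, and uniform prior below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def random_walk_rerank(sim, prior, alpha=0.85, iters=200):
    """Iterate s <- alpha * P^T s + (1 - alpha) * r, where P is the
    row-normalized similarity matrix and r the initial ranking prior."""
    P = sim / sim.sum(axis=1, keepdims=True)
    r = prior / prior.sum()
    s = r.copy()
    for _ in range(iters):
        s = alpha * (P.T @ s) + (1 - alpha) * r
    return s
```

Images that many visually consistent neighbours point to accumulate score and move up the ranking, which suppresses isolated (likely irrelevant) results.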
Title: Multi-hypothesis transform domain Wyner-Ziv video coding including optical flow
Authors: Xin Huang, L. L. Rakêt, Huynh Van Luong, M. Nielsen, F. Lauze, Søren Forchhammer
Venue: 2011 IEEE 13th International Workshop on Multimedia Signal Processing
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093771
Abstract: Transform Domain Wyner-Ziv (TDWZ) video coding is an efficient Distributed Video Coding solution providing new features such as low-complexity encoding, mainly by exploiting the source statistics at the decoder based on the availability of decoder side information. The accuracy of the side information has a major impact on the performance of TDWZ. In this paper, a novel multi-hypothesis based TDWZ video coding scheme is presented that exploits the redundancy between multiple side information streams and the source information. The decoder uses optical flow for side information calculation. Compared with the best available single-estimation-mode TDWZ, the proposed multi-hypothesis based TDWZ robustly achieves better Rate-Distortion (RD) performance, with an overall improvement of up to 0.6 dB at high bitrates, and up to 2 dB compared with the DISCOVER TDWZ video codec.
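The benefit of fusing several side-information hypotheses shows up even with the simplest fusion rule, a per-pixel average: when the hypotheses carry roughly independent errors, the fused frame is closer to the source than either hypothesis alone. This toy demonstration is an assumption-level illustration of why multiple hypotheses help, not the paper's optical-flow-based scheme.

```python
import numpy as np

def fuse_hypotheses(hypotheses):
    """Per-pixel average of side-information hypotheses; with independent
    errors of equal variance, averaging two halves the error variance."""
    return np.mean(np.stack(hypotheses), axis=0)
```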
Title: Directional samples reordering for intra residual transform
Authors: Q. Wu, Hongliang Li, Tiantang Chen
Venue: 2011 IEEE 13th International Workshop on Multimedia Signal Processing
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093814
Abstract: In this paper, a directional samples reordering (DSR) based algorithm is proposed for intra residual data transform. To make the residual more suitable for the discrete cosine transform, a diagonal edge in an intra prediction block of arbitrary size can be rotated into a regular horizontal edge by reordering the samples in the block. The intra residual data then undergo a 2D discrete cosine transform after the DSR procedure. Experimental results show that gains of up to 0.5967 dB, and 0.4372 dB on average, can be achieved for CIF sequences at high bitrates with the proposed algorithm in high-complexity mode, compared with H.264 intra coding.
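The motivation for DSR — the 2D DCT compacts a horizontal edge into far fewer coefficients than a diagonal one — can be checked directly. The snippet below only demonstrates this compaction property; it does not reproduce the paper's actual reordering rule.

```python
import numpy as np

def dct2(block):
    """Unnormalized separable 2D DCT-II: X[u,v] = C^T B C with
    C[x,u] = cos(pi*(x+0.5)*u/n)."""
    n = block.shape[0]
    k = np.arange(n)
    C = np.cos(np.pi * (k[:, None] + 0.5) * k[None, :] / n)
    return C.T @ block @ C

n = 8
# Horizontal step edge: constant along each row -> separable, and the DCT of
# a constant profile is non-zero only at frequency 0, so the whole 2D DCT
# lives in a single column.
horizontal = (np.arange(n)[:, None] >= n // 2).astype(float) * np.ones((1, n))
# Diagonal step edge: not separable -> energy spread over many coefficients.
diagonal = (np.arange(n)[None, :] >= np.arange(n)[:, None]).astype(float)

nnz = lambda b: int((np.abs(dct2(b)) > 1e-9).sum())
```

Rotating the diagonal edge into a horizontal one before the transform therefore concentrates the residual energy into fewer coefficients.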
Title: Sample adaptive offset for HEVC
Authors: Chih-Ming Fu, Ching-Yeh Chen, Yu-Wen Huang, S. Lei
Venue: 2011 IEEE 13th International Workshop on Multimedia Signal Processing
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093807
Abstract: A new video coding tool, sample adaptive offset (SAO), is introduced in this paper. SAO has been adopted into the Working Draft of the new video coding standard, High Efficiency Video Coding (HEVC). SAO is located after deblocking in the video coding loop. The concept of SAO is to classify reconstructed pixels into different categories and then reduce the distortion by simply adding an offset for each category of pixels. Pixel intensity and edge properties are used for pixel classification. To further improve coding efficiency, a picture can be divided into regions for localization of offset parameters. Simulation results show that SAO achieves a 2% bit rate reduction on average and up to 6%, while the runtime increases for encoder and decoder are only 2%.
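SAO's edge-offset mode classifies each pixel by comparing it with its two neighbours along a chosen direction — local valley, concave corner, convex corner, local peak — and adds a signalled offset per category. A 1-D sketch of the classifier and the offset application, following the HEVC edge-offset category definitions:

```python
import numpy as np

def eo_category(a, p, b):
    """HEVC SAO edge-offset category of pixel p with neighbours a and b."""
    if p < a and p < b:
        return 1                                  # local valley
    if (p < a and p == b) or (p == a and p < b):
        return 2                                  # concave corner
    if (p > a and p == b) or (p == a and p > b):
        return 3                                  # convex corner
    if p > a and p > b:
        return 4                                  # local peak
    return 0                                      # monotonic: no offset

def sao_edge_offset(row, offsets):
    """Apply per-category offsets (offsets[k-1] for category k) to a 1-D row.
    Categories are derived from the input (deblocked) values, not the output."""
    out = row.copy()
    for i in range(1, len(row) - 1):
        c = eo_category(row[i - 1], row[i], row[i + 1])
        if c:
            out[i] = row[i] + offsets[c - 1]
    return out
```

Raising valleys and lowering peaks pulls reconstructed samples back toward the original signal, which is what suppresses ringing-like distortion.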
Title: On residual quad-tree coding in HEVC
Authors: Y. H. Tan, Chuohao Yeo, Hui Li Tan, Zhengguo Li
Venue: 2011 IEEE 13th International Workshop on Multimedia Signal Processing
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093805
Abstract: In the current working draft of HEVC, residual quad-tree (RQT) coding is used to encode prediction residuals in both Intra and Inter coding units (CUs). However, the rationale for using RQT as a coding tool differs between the two cases. For Intra prediction units, RQT provides an efficient syntax for coding a number of sub-blocks that share the same intra prediction mode. For Inter CUs, RQT adapts to the spatial-frequency variations of the CU, using as large a transform size as possible while catering to local variations in residual statistics. While providing coding gains, effective use of RQT currently requires an exhaustive search over all possible combinations of transform sizes within a block. In this paper, we exploit these insights to develop two fast RQT algorithms, designed to meet the respective needs of Intra and Inter prediction residual coding.
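The exhaustive search the paper accelerates can be pictured with a toy rate model: coding a block costs one header plus, if any coefficient survives, a rate proportional to the block area (a crude significance-map proxy — an assumption, not HEVC's real RD cost); splitting is chosen whenever the four quadrants are cheaper in total.

```python
import numpy as np

def rqt_cost(residual, min_size=4):
    """Exhaustive residual quad-tree search: minimum over 'code the whole
    block' vs. 'split into four quadrants', applied recursively."""
    nnz = np.count_nonzero(residual)
    no_split = 1 + (residual.size if nnz else 0)   # header + toy rate
    n = residual.shape[0]
    if n <= min_size:
        return no_split
    h = n // 2
    quads = [residual[:h, :h], residual[:h, h:],
             residual[h:, :h], residual[h:, h:]]
    split = 1 + sum(rqt_cost(q, min_size) for q in quads)
    return min(no_split, split)
```

Under this model a residual confined to one quadrant is cheaper to split (the three empty quadrants cost only their headers), while an all-zero residual is cheapest unsplit — the localization-vs-overhead trade-off the fast algorithms exploit.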
Title: Selective pixel interpolation for spatial error concealment
Authors: Yi Ge, Bo Yan, Kairan Sun, H. Gharavi
Venue: 2011 IEEE 13th International Workshop on Multimedia Signal Processing
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093828
Abstract: This paper proposes an effective algorithm for spatial error concealment based on accurate edge detection and partitioned interpolation. First, a new method detects possible edge pixels and their matching pixels around the lost block. The true edge lines are then determined and used to partition the lost block. Finally, based on the partitioning result, each lost pixel is interpolated from correct reference pixels that lie in the same region as the lost pixel. Experimental results show that the proposed spatial error concealment method clearly outperforms previous methods, by up to 4.04 dB across different sequences.
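The non-directional baseline that edge-aware methods like this one improve upon interpolates each lost pixel as a distance-weighted average of the four surrounding boundary pixels in its row and column, in the spirit of the classic weighted-averaging concealment. A sketch of that baseline (not the proposed edge-partitioned interpolator):

```python
import numpy as np

def conceal_block(frame, y0, x0, bs):
    """Distance-weighted concealment of a bs x bs lost block at (y0, x0)
    from the four neighbouring boundary pixels of each row and column."""
    out = frame.astype(float).copy()
    top = frame[y0 - 1, x0:x0 + bs].astype(float)
    bottom = frame[y0 + bs, x0:x0 + bs].astype(float)
    left = frame[y0:y0 + bs, x0 - 1].astype(float)
    right = frame[y0:y0 + bs, x0 + bs].astype(float)
    for i in range(bs):
        for j in range(bs):
            dT, dB = i + 1, bs - i      # distances to top/bottom boundaries
            dL, dR = j + 1, bs - j      # distances to left/right boundaries
            # weight each boundary pixel by the distance to the OPPOSITE side,
            # so nearer references dominate
            num = dB * top[j] + dT * bottom[j] + dR * left[i] + dL * right[i]
            out[y0 + i, x0 + j] = num / (dT + dB + dL + dR)
    return out
```

Such averaging blurs across edges that cross the lost block, which is exactly the failure mode the edge-partitioned interpolation avoids.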