In this paper, we propose a reversible data hiding (RDH) method based on a two-dimensional wavelet coefficient histogram (2D WCH) in the wavelet domain. First, a cover image is decomposed into wavelet subbands using the invertible integer-to-integer wavelet transform (I2I-WT). Then, the 2D WCH is generated by counting the occurrence frequency of wavelet coefficient pairs, i.e., two wavelet coefficients located at the same position in the two selected subbands in which the secret message is embedded. By using the 2D WCH, the correlation between the selected subbands is utilized more effectively than with the traditional 1D histogram. The proposed method embeds the secret message reversibly in the cover image by expanding the 2D WCH. In order to embed the secret message as efficiently as possible, an expansion rule for the 2D WCH is proposed. Moreover, coefficient pair selection (CPS), in which the coefficients used for embedding are selected so that only those coefficients are modified, is performed before generating the 2D WCH. In the experiment, the proposed method is compared with conventional RDH methods in terms of the capacity-distortion curve.
{"title":"Two-dimensional histogram expansion of wavelet coefficient for reversible data hiding","authors":"Kazuki Yamato, Kazuma Shinoda, Madoka Hasegawa, Shigeo Kato","doi":"10.1109/VCIP.2014.7051553","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051553","url":null,"abstract":"In this paper, we propose a reversible data hiding (RDH) method based on a two-dimensional wavelet coefficient histogram (2D WCH) in the wavelet domain. First, a cover image is decomposed into wavelet subbands using the invertible integer-to-integer wavelet transform (I2I-WT). Then, the 2D WCH is generated by counting the occurrence frequency of wavelet coefficient pairs, i.e., two wavelet coefficients located at the same position in the two selected subbands in which the secret message is embedded. By using the 2D WCH, the correlation between the selected subbands is utilized more effectively than with the traditional 1D histogram. The proposed method embeds the secret message reversibly in the cover image by expanding the 2D WCH. In order to embed the secret message as efficiently as possible, an expansion rule for the 2D WCH is proposed. Moreover, coefficient pair selection (CPS), in which the coefficients used for embedding are selected so that only those coefficients are modified, is performed before generating the 2D WCH. In the experiment, the proposed method is compared with conventional RDH methods in terms of the capacity-distortion curve.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121153626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01 DOI: 10.1109/VCIP.2014.7051510
Yuwen He, Yan Ye, Jie Dong
Color gamut scalability (CGS) in the scalable extensions of High Efficiency Video Coding (SHVC) supports scalable coding with multiple layers in different color spaces. A base layer conveying HDTV video in the BT.709 color space and an enhancement layer conveying UHDTV video in the BT.2020 color space is identified as a practical use case for CGS. Efficient CGS coding can be achieved using a 3D Look-up Table (LUT) based color conversion process. This paper proposes a robust 3D LUT parameter estimation method that estimates the 3D LUT parameters globally using the least squares method. Problems of matrix sparsity and uneven sample distribution are carefully handled to improve the stability and accuracy of the estimation process. Simulation results confirm that the proposed 3D LUT estimation method can significantly improve coding performance compared with other gamut conversion methods.
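A minimal sketch of the global least-squares idea, fitting a plain 3×3 linear BT.709-to-BT.2020 mapping rather than the paper's trilinearly interpolated 3D LUT; all sample data and the matrix below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 3))                    # synthetic BT.709 color samples
M_true = np.array([[1.10, -0.05, 0.00],
                   [0.02,  0.95, 0.03],
                   [0.00, -0.01, 1.05]])    # hypothetical gamut-mapping matrix
Y = X @ M_true.T                            # corresponding BT.2020 samples

# Global least squares: solve min_B ||X B - Y||_F^2, then M = B^T.
B, *_ = np.linalg.lstsq(X, Y, rcond=None)
M_est = B.T
```

The paper's contribution lies in making such a global fit stable when the LUT's normal matrix is sparse and the training samples are unevenly distributed, which this toy dense fit does not exhibit.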
{"title":"Robust 3D LUT estimation method for SHVC color gamut scalability","authors":"Yuwen He, Yan Ye, Jie Dong","doi":"10.1109/VCIP.2014.7051510","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051510","url":null,"abstract":"Color gamut scalability (CGS) in scalable extensions of High Efficiency Video Coding (SHVC) supports scalable coding with multiple layers in different color spaces. Base layer conveying HDTV video in BT.709 color space and enhancement layer conveying UHDTV video in BT.2020 color space is identified as a practical use case for CGS. Efficient CGS coding can be achieved using a 3D Look-up Table (LUT) based color conversion process. This paper proposes a robust 3D LUT parameter estimation method that estimates the 3D LUT parameters globally using the Least Square method. Problems of matrix sparsity and uneven sample distribution are carefully handled to improve the stability and accuracy of the estimation process. Simulation results confirm that the proposed 3D LUT estimation method can significantly improve coding performance compared with other gamut conversion methods.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115380796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01 DOI: 10.1109/VCIP.2014.7051498
Li Song, Chen Chen, Yi Xu, Genjian Xue, Yi Zhou
A recently proposed model, the blind/referenceless image spatial quality evaluator (BRISQUE), achieves state-of-the-art performance in the context of blind image quality assessment (IQA). That model uses a predefined generalized Gaussian distribution (GGD) to describe the regularity of natural scene statistics, introducing fitting errors due to variations in image content. In this paper, a more general model is proposed to better characterize the regularity of diverse image content; it is learned from the concatenated histograms of mean-subtracted contrast-normalized (MSCN) coefficients and the pairwise products of MSCN coefficients of neighbouring pixels. The new MSCN-based feature preserves the intrinsic distribution of image statistics, so support vector regression (SVR) can map it to more accurate image quality scores. Experimental results show that the proposed approach achieves a slight gain over BRISQUE, which indicates that the crafted GGD modelling step in BRISQUE is not essential for final performance.
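A rough sketch of BRISQUE-style MSCN coefficients; the Gaussian window width and stabilizing constant here are illustrative defaults, not values taken from the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn(image, sigma=7 / 6, c=1.0):
    """Mean-subtracted, contrast-normalized coefficients of an image."""
    image = np.asarray(image, dtype=float)
    mu = gaussian_filter(image, sigma)                 # local mean
    var = gaussian_filter(image * image, sigma) - mu * mu
    sigma_map = np.sqrt(np.maximum(var, 0.0))          # local std deviation
    return (image - mu) / (sigma_map + c)
```

A flat image yields all-zero MSCN coefficients; natural images yield coefficients whose histogram the paper uses directly as a feature instead of fitting a GGD.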
{"title":"Blind image quality assessment based on a new feature of nature scene statistics","authors":"Li Song, Chen Chen, Yi Xu, Genjian Xue, Yi Zhou","doi":"10.1109/VCIP.2014.7051498","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051498","url":null,"abstract":"A recently proposed model, known as blind/referenceless image spatial quality evaluator (BRISQUE), achieves the state-of-the-art performance in context of blind image quality assessment (IQA). This model used the predefined generalized Gaussian distribution (GGD) to describe the regularity of natural scene statistics, introducing fitting errors due to variations of image contents. In this paper, a more generalized model is proposed to better characterize the regularity of extensive image contents, which is learned from the concatenated histograms of mean subtracted contrast normalized (MSCN) coefficients and pairwise products of MSCN coefficients of neighbouring pixels. The new feature based on MSCN shows its capability of preserving intrinsic distribution of image statistics. Consequently support vector machine regression (SVR) can map it to more accurate image quality scores. Experimental results show that the proposed approach achieves a slight gain from BRISQUE, which indicates the crafted GGD modelling step in BRISQUE is not essential for final performance.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115904665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01 DOI: 10.1109/VCIP.2014.7051519
Fabian Jäger, M. Wien
3D video is an emerging technology that bundles depth information with texture videos to allow for view synthesis applications at the receiver. Depth discontinuities define object boundaries in both depth maps and the collocated texture video. Therefore, depth segmentation can be utilized for a fine-grained motion-field partitioning of the corresponding texture component. In this paper, depth information is used to increase coding efficiency for texture videos by deriving an arbitrarily shaped partitioning. By applying motion compensation to each partition independently and then merging the two prediction signals, highly accurate prediction signals can be produced that significantly reduce the remaining texture residual. Simulation results show bitrate savings of up to 2.8% for the dependent texture views and up to about 1.0% with respect to the total bitrate.
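A highly simplified sketch of the idea: derive a binary partition by thresholding a depth block at its mean, then merge two motion-compensated predictions along that mask (`pred_fg` and `pred_bg` are hypothetical prediction signals, not outputs of a real codec):

```python
import numpy as np

def depth_partition_merge(depth_block, pred_fg, pred_bg):
    """Split a block by its mean depth and merge two predictions."""
    mask = depth_block >= depth_block.mean()   # arbitrary-shape partition
    return np.where(mask, pred_fg, pred_bg)

depth = np.array([[1, 1], [9, 9]])
merged = depth_partition_merge(depth, np.full((2, 2), 10), np.full((2, 2), 20))
```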
{"title":"Simplified depth-based block partitioning and prediction merging in 3D video coding","authors":"Fabian Jäger, M. Wien","doi":"10.1109/VCIP.2014.7051519","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051519","url":null,"abstract":"3D video is an emerging technology that bundles depth information with texture videos to allow for view synthesis applications at the receiver. Depth discontinuities define object boundaries in both, depth maps and the collocated texture video. Therefore, depth segmentation can be utilized for a fine-grained motion field partitioning of the corresponding texture component. In this paper, depth information is used to increase coding efficiency for texture videos by deriving an arbitrarily shaped partitioning. By applying motion compensation to each partition independently and eventually merging the two prediction signals, highly accurate prediction signals can be produced that reduce the remaining texture residual signal significantly. Simulation results show bitrate savings of up to 2.8% for the dependent texture views and up to about 1.0% with respect to the total bitrate.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116297947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Camcorder piracy has a great impact on the movie industry. Although there are many methods to prevent recording in theatres, no recognized technology defeats camcorder piracy while having no effect on the audience. This paper presents a new projector display technique to defeat camcorder piracy in the theatre using a new paradigm of information display technology called temporal psychovisual modulation (TPVM). TPVM exploits the difference in image formation mechanisms between human eyes and imaging sensors. The image formed in human vision is a continuous integration of the light field, whereas digital video acquisition uses discrete sampling with a "blackout" period in each sampling cycle. Based on this difference, a movie can be decomposed into a set of display frames broadcast at high speed, so that the audience does not notice any disturbance while the video frames captured by a camcorder contain highly objectionable artifacts. The prototype system built on the DLP® LightCrafter 4500™ platform serves as a proof of concept of the anti-piracy system.
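One way to picture the decomposition (a toy sketch, not the authors' DLP frame synthesis): split each movie frame into two display frames whose average, which the eye integrates, equals the original, while each individual frame, which a camcorder samples in isolation, is corrupted by noise:

```python
import numpy as np

def tpvm_split(frame, rng, amp=0.25):
    """Split a frame (values in [0, 1]) into two noisy display frames
    that average back to the original frame."""
    noise = rng.uniform(-amp, amp, frame.shape)
    limit = np.minimum(frame, 1.0 - frame)   # keep both halves displayable
    noise = np.clip(noise, -limit, limit)
    return frame + noise, frame - noise

rng = np.random.default_rng(1)
frame = np.full((4, 4), 0.5)
f1, f2 = tpvm_split(frame, rng)
```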
{"title":"DLP based anti-piracy display system","authors":"Zhongpai Gao, Guangtao Zhai, Xiaolin Wu, Xiongkuo Min, Cheng Zhi","doi":"10.1109/VCIP.2014.7051525","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051525","url":null,"abstract":"Camcorder piracy has great impact on the movie industry. Although there are many methods to prevent recording in theatre, no recognized technology satisfies the need of defeating camcorder piracy as well as having no effect on the audience. This paper presents a new projector display technique to defeat camcorder piracy in the theatre using a new paradigm of information display technology, called temporal psychovisual modulation (TPVM). TPVM exploits the difference in image formation mechanisms of human eyes and imaging sensors. The images formed in human vision is continuous integration of the light field while discrete sampling is used in digital video acquisition which has \"blackout\" period in each sampling cycle. Based on this difference, we can decompose a movie into a set of display frames and broadcast them out at high speed so that the audience can not notice any disturbance, while the video frames captured by camcorder will contain highly objectionable artifacts. The proposed prototype system built on the platform of DLP® LightCrafter 4500™ serves as a proof-of-concept of anti-piracy system.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115699688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The new video coding standard, High Efficiency Video Coding (HEVC), has been established to succeed the widely used H.264/AVC standard. However, an enormous amount of legacy content is encoded with H.264/AVC, creating a great need for high-performance AVC-to-HEVC transcoding. This paper presents a fast transcoding algorithm based on residual and motion information extracted from the H.264 decoder. By exploiting this side information, the homogeneity characteristics of regions are analysed. An efficient coding unit (CU) and prediction unit (PU) mode decision strategy is proposed, combining regions' prediction homogeneity with current encoding information. The experimental results show that the proposed transcoding scheme can save up to 55% of encoding time with negligible loss of coding efficiency compared to a full-decoding, full-encoding transcoder.
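A hypothetical illustration of using decoder side information for mode decision: treat a region whose decoded H.264 motion vectors are homogeneous and whose residual energy is low as one that likely needs no further CU splitting. The function and thresholds below are invented for illustration, not taken from the paper:

```python
import numpy as np

def should_test_split(mv_field, residual_energy,
                      mv_var_thresh=4.0, res_thresh=100.0):
    """Return True if the CU looks inhomogeneous enough to try splitting."""
    mv = np.asarray(mv_field, dtype=float).reshape(-1, 2)
    mv_variance = mv.var(axis=0).sum()   # spread of decoded motion vectors
    return mv_variance > mv_var_thresh or residual_energy > res_thresh
```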
{"title":"Effective H.264/AVC to HEVC transcoder based on prediction homogeneity","authors":"Feiyang Zheng, Zhiru Shi, Xiaoyun Zhang, Zhiyong Gao","doi":"10.1109/VCIP.2014.7051547","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051547","url":null,"abstract":"The new video coding standard, High Efficiency Video Coding (HEVC), has been established to succeed the widely used H.264/AVC standard. However, an enormous amount of legacy content is encoded with H.264/AVC. This makes high performance AVC to HEVC transcoding in great need. This paper presents a fast transcoding algorithm based on residual and motion information extracted from H.264 decoder. By exploiting these side information, regions' homogeneity characteristic are analysed. An efficient coding unit (CU) and prediction unit (PU) mode decision strategy is proposed combing regions' prediction homogeneity and current encoding information. The experimental results show that the proposed transcoding scheme can save up to 55% of encoding time with negligible loss of coding efficiency, when compared to that of the full decoding and full encoding transcoder.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115012299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01 DOI: 10.1109/VCIP.2014.7051490
Li Liu, Chao Zhou, Xinggong Zhang, Zongming Guo, Cheng Li
Recently, parallel Dynamic Adaptive Streaming over HTTP (DASH) has emerged as a promising way to supply higher bandwidth, connection diversity, and reliability. However, downloading chunks sequentially in parallel DASH remains a big challenge due to the heterogeneous and time-varying bandwidth of multiple servers. In this paper, we propose a novel probabilistic chunk scheduling approach that accounts for time-varying bandwidth. Video chunks are scheduled to the servers that consume the least time while having the highest probability of completing the download before the deadline. The approach is formulated as a constrained optimization problem whose objective is to minimize the total downloading time. Using the probabilistic model of time-varying bandwidth, we first estimate the probability of successfully downloading chunks before the playback deadline, and then estimate the download time of each chunk. A near-optimal algorithm is designed that schedules chunks to the servers with minimal downloading time while keeping the completion probability within the constraint. Experimental results demonstrate that, compared with existing schemes, the proposed scheme greatly increases the number of chunks received in order.
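A toy version of the selection rule under a Gaussian bandwidth model (the model, server names, and numbers are illustrative, not the paper's): among servers likely to finish before the deadline, pick the one with the smallest expected download time:

```python
import math

def pick_server(chunk_bits, servers, deadline, p_min=0.95):
    """servers maps name -> (mean_bw, std_bw) in bits/s.
    Returns the admissible server with the least expected time, or None."""
    best_name, best_time = None, float("inf")
    for name, (mean_bw, std_bw) in servers.items():
        bw_needed = chunk_bits / deadline
        # P(finish by deadline) = P(bandwidth >= bw_needed) under N(mean, std^2)
        z = (mean_bw - bw_needed) / std_bw
        p_ok = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        expected_time = chunk_bits / mean_bw
        if p_ok >= p_min and expected_time < best_time:
            best_name, best_time = name, expected_time
    return best_name
```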
{"title":"Probabilistic chunk scheduling approach in parallel multiple-server DASH","authors":"Li Liu, Chao Zhou, Xinggong Zhang, Zongming Guo, Cheng Li","doi":"10.1109/VCIP.2014.7051490","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051490","url":null,"abstract":"Recently, parallel Dynamic Adaptive Streaming over HTTP (DASH) has emerged as a promising way to supply higher bandwidth, connection diversity, and reliability. However, downloading chunks sequentially in parallel DASH remains a big challenge due to the heterogeneous and time-varying bandwidth of multiple servers. In this paper, we propose a novel probabilistic chunk scheduling approach that accounts for time-varying bandwidth. Video chunks are scheduled to the servers that consume the least time while having the highest probability of completing the download before the deadline. The approach is formulated as a constrained optimization problem whose objective is to minimize the total downloading time. Using the probabilistic model of time-varying bandwidth, we first estimate the probability of successfully downloading chunks before the playback deadline, and then estimate the download time of each chunk. A near-optimal algorithm is designed that schedules chunks to the servers with minimal downloading time while keeping the completion probability within the constraint. Experimental results demonstrate that, compared with existing schemes, the proposed scheme greatly increases the number of chunks received in order.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129422509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01 DOI: 10.1109/VCIP.2014.7051603
Yuchen Li, Yitong Liu, Hongwen Yang, Dacheng Yang
The latest standard, High Efficiency Video Coding (HEVC), offers better coding efficiency than H.264/AVC. It is reported that, at the same encoding quality, the bitrate of HEVC-coded video is half that of H.264/AVC-coded video. However, the cost of this improvement is increased computational complexity, mainly introduced by the quadtree-based coding tree unit (CTU). In this paper, a fast coding unit (CU) splitting and pruning method is proposed to speed up the search for the best CTU partition. Experiments show that our method saves 46% of computational complexity on average, at the cost of increasing the Bjontegaard delta rate (BD-rate) by 0.82%, when applied to sequences in Class A.
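The partition search being accelerated can be sketched as a recursion that compares a CU's whole-block cost against the summed cost of its four sub-CUs; `cost_fn` is a placeholder for a real rate-distortion cost, and the sizes are illustrative:

```python
def best_cost(cost_fn, x, y, size, min_size=8):
    """Exhaustive quadtree partition search for one CTU: keep a CU
    whole if its cost beats the sum of its four sub-CU costs."""
    whole = cost_fn(x, y, size)
    if size <= min_size:
        return whole
    half = size // 2
    split = sum(best_cost(cost_fn, x + dx, y + dy, half, min_size)
                for dy in (0, half) for dx in (0, half))
    return min(whole, split)
```

Splitting and pruning methods such as the paper's aim to skip branches of this recursion whose outcome can be predicted, which is where the complexity saving comes from.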
{"title":"Fast CU splitting and pruning method based on online learning for intra coding in HEVC","authors":"Yuchen Li, Yitong Liu, Hongwen Yang, Dacheng Yang","doi":"10.1109/VCIP.2014.7051603","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051603","url":null,"abstract":"The latest standard of High Efficiency Video Coding (HEVC) has a better coding efficiency compared to H.264/AVC. It is reported that the bitrate of the video applied HEVC is the half of video applied H.264/AVC at the same encoding quality. However, the cost of improvement is the increasing computational complexity which is mainly brought by the quadtree based coding tree unit (CTU). In this paper, a fast coding units (CU) splitting and pruning method is proposed to speed up the process of searching the best partition for CTU. Experiment has shown that our method can save 46% computational complexity on average at the cost of increasing Bjontegaard delta rate (BD-rate) by 0.82% when the method is applied to sequences in Class A.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129422541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01 DOI: 10.1109/VCIP.2014.7051591
Chen Zhao, Siwei Ma, Wen Gao
Seeking a suitable domain in which the signal exhibits high sparsity is of essential significance in compressive sensing (CS). Most methods in the literature, however, use a fixed transform domain or fixed prior information, which cannot adapt to various video contents. In this paper, we propose a video CS recovery algorithm based on a structured Laplacian model, which can effectively deal with the non-stationarity of natural videos. To build the model, structured patch groups are constructed according to nonlocal similarity within a temporal scope. By incorporating the model into the CS paradigm, we formulate an ℓ1-norm optimization problem, for which a solution based on iterative shrinkage/thresholding algorithms (ISTA) is designed. Experimental results demonstrate that the proposed algorithm outperforms state-of-the-art methods in both objective and subjective recovery quality.
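The ISTA-based solver amounts to iterating a gradient step followed by soft thresholding; a generic sketch for min ½‖Ax − y‖² + λ‖x‖₁ (not the paper's structured Laplacian variant, and with an illustrative fixed step size):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, step, n_iter=200):
    """Plain ISTA: gradient step on the data term, then shrinkage."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - step * A.T @ (A @ x - y), step * lam)
    return x
```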
{"title":"Video compressive sensing via structured Laplacian modelling","authors":"Chen Zhao, Siwei Ma, Wen Gao","doi":"10.1109/VCIP.2014.7051591","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051591","url":null,"abstract":"Seeking a fair domain in which the signal can exhibit high sparsity is of essential significance in compressive sensing (CS). Most methods in the literature, however, use a fixed transform domain or prior information, which cannot adapt to various video contents. In this paper, we propose a video CS recovery algorithm based on the structured Laplacian model, which can effectually deal with the non-stationarity of natural videos. To build the model, structured patch groups are constructed according to the nonlocal similarity in a temporal scope. By incorporating the model into the CS paradigm, we can formulate an ℓ1-norm optimization problem, for which a solution based on the iterative shrinkage/thresholding algorithms (ISTA) is designed. Experimental results demonstrate that the proposed algorithm outperforms the state-of-the-art methods in both objective and subjective recovery quality.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130217876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01 DOI: 10.1109/VCIP.2014.7051597
R. Farrugia, Maverick Hili
This paper presents a depth coding strategy that employs K-means clustering to segment the sequence of depth images into K clusters. The resulting clusters are losslessly compressed and transmitted as supplemental enhancement information to aid the decoder in predicting macroblocks containing depth discontinuities. The method further employs an in-loop boundary reconstruction filter to reduce distortions at the edges. The proposed algorithm was integrated into both the H.264/AVC and H.264/MVC video coding standards. Simulation results demonstrate that the proposed scheme outperforms state-of-the-art depth coding schemes, with rendered Peak Signal-to-Noise Ratio (PSNR) gains between 0.1 dB and 0.5 dB observed.
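The clustering step can be illustrated with a tiny K-means on scalar depth values (deterministic initialization for reproducibility; the actual method clusters full depth images):

```python
import numpy as np

def kmeans_depth(depths, k, n_iter=50):
    """1D K-means: assign each depth to its nearest center, then
    recompute each center as the mean of its assigned depths."""
    depths = np.asarray(depths, dtype=float)
    centers = np.linspace(depths.min(), depths.max(), k)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(depths[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = depths[labels == j].mean()
    return labels, centers

labels, centers = kmeans_depth([0, 1, 2, 100, 101, 102], 2)
```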
{"title":"Depth coding using depth discontinuity prediction and in-loop boundary reconstruction filtering","authors":"R. Farrugia, Maverick Hili","doi":"10.1109/VCIP.2014.7051597","DOIUrl":"https://doi.org/10.1109/VCIP.2014.7051597","url":null,"abstract":"This paper presents a depth coding strategy that employs K-means clustering to segment the sequence of depth images into K clusters. The resulting clusters are losslessly compressed and transmitted as supplemental enhancement information to aid the decoder in predicting macroblocks containing depth discontinuities. This method further employs an in-loop boundary reconstruction filter to reduce distortions at the edges. The proposed algorithm was integrated within both H.264/AVC and H.264/MVC video coding standards. Simulation results demonstrate that the proposed scheme outperforms the state of the art depth coding schemes, where rendered Peak Signal to Noise Ratio (PSNR) gains between 0.1 dB and 0.5 dB were observed.","PeriodicalId":166978,"journal":{"name":"2014 IEEE Visual Communications and Image Processing Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131030883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}