Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702457
Qionghai Dai, Xiangyang Ji, Xun Cao
3D video capturing acquires visual information in a three-dimensional manner and constitutes the first step of the entire 3DTV system chain, preceding 3D coding, transmission, and visualization. 3D capturing plays an important role because precise 3D visual capturing benefits the whole 3DTV system. During the past decades, various kinds of capturing systems have been built for different applications such as FTV [1], 3DTV, and 3D movies. As sensor costs have fallen in recent years, many systems use multiple cameras to acquire visual information, an approach called multiview capturing; 3D information can then be extracted through multiview geometry. We will first give a brief review of these multiview systems and analyze their relationships from the perspective of the plenoptic function [2]. Along with multiple cameras, many systems also employ multiple lights to control the illumination conditions. A new concept, the vision field, is presented in this talk based on the view-light-time subspace, which can be derived from the plenoptic function. The features and applications of each capturing system are emphasized, as well as important capturing issues such as synchronization and calibration. Besides multiple-camera systems, new techniques using TOF (time-of-flight) cameras [3] and 3D scanners are also covered in this talk.
Title: Vision field capturing and its applications in 3DTV (28th Picture Coding Symposium)
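For reference, the plenoptic function mentioned in the abstract above is commonly written in its seven-dimensional Adelson-Bergen form (the talk's exact parameterization may differ):

```latex
P = P(\theta, \phi, \lambda, t, V_x, V_y, V_z)
```

where $(\theta, \phi)$ is the viewing direction, $\lambda$ the wavelength, $t$ time, and $(V_x, V_y, V_z)$ the viewpoint position. The vision field described in the talk restricts attention to a view-light-time subspace of this function, with illumination controlled as an additional sampled dimension.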
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702522
S. Rodríguez, F. Díaz-de-María
In this paper we propose a novel VBR controller for real-time H.264/SVC video coding. Since consecutive pictures within the same scene often exhibit similar degrees of complexity, the proposed VBR controller allows only an incremental variation of the QP with respect to that of the previous picture, thus preventing unnecessary QP fluctuations. For this purpose, an RBF network has been carefully designed to estimate the QP increment at each dependency (spatial or CGS) layer. A mobile live-streaming application scenario was simulated to assess the performance of the proposed VBR controller, which was compared to a recently proposed CBR controller for H.264/SVC. The experimental results show remarkably consistent quality, notably outperforming the reference CBR controller.
Title: RBF-based VBR controller for real-time H.264/SVC video coding
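The core of the controller above can be illustrated with a minimal Gaussian RBF evaluator that maps coding features to a clamped QP increment. The feature choice, the centers/widths/weights, and the ±2 clamp are illustrative assumptions, not values from the paper:

```python
import math

def rbf_delta_qp(features, centers, widths, weights, bias=0.0):
    """Evaluate a Gaussian RBF network mapping coding features
    (e.g. buffer fullness, previous-picture complexity) to a QP increment."""
    out = bias
    for c, s, w in zip(centers, widths, weights):
        d2 = sum((f - cj) ** 2 for f, cj in zip(features, c))
        out += w * math.exp(-d2 / (2.0 * s * s))
    # Clamp so the QP changes only incrementally from picture to picture
    return max(-2, min(2, round(out)))
```

In a real controller one such network would be trained per dependency layer; here a single call returns the ΔQP to add to the previous picture's QP.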
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702512
Masaaki Matsumura, Seishi Takamura, H. Jozawa
Many image/video codecs are constructed by combining various coding tools such as block division/scanning, branch selection, and entropy coders. Codec researchers are developing new coding tools and seeking versatile combinations that offer improved coding efficiency for various images/videos. However, because the number of possible combinations is enormous, finding the best one manually is infeasible. In this paper, we propose an automatic optimization method for deriving the combination that suits categorized pictures. We prepare several categorized picture sets and optimize the combination for each category. In the case of optimization for lossless image coding, our method achieves a maximum bit-rate reduction of over 2.8% compared to a fixed combination, prepared beforehand, that offers the best average bit-rate.
Title: Generating subject oriented codec by evolutionary approach
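The evolutionary search described above can be sketched as a toy genetic algorithm over binary on/off flags for coding tools. The cost function, population size, and genetic operators here are illustrative assumptions, not the paper's configuration:

```python
import random

def evolve(cost, n_tools=8, pop_size=20, generations=50, seed=0):
    """Toy genetic search over binary tool-combination genomes.
    `cost` maps a tuple of 0/1 tool flags to a bit-rate-like score
    (lower is better)."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_tools))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]      # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_tools)   # one-point crossover
            child = list(a[:cut] + b[cut:])
            child[rng.randrange(n_tools)] ^= 1  # single-bit mutation
            children.append(tuple(child))
        pop = survivors + children
    return min(pop, key=cost)
```

With a per-category cost (e.g. average bit-rate over that category's pictures), one run per category yields a combination tuned to that picture class.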
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702499
Cheng-Wei Chou, J. Tsai, H. Hang, Hung-Chih Lin
In this paper, we propose a fast graph cut (GC) algorithm for disparity estimation. Two accelerating techniques are suggested: one is an early termination rule, and the other is prioritizing the α-β swap pair search order. Our simulations show that the proposed fast GC algorithm reduces the average computation time by a factor of about 2.1 compared with the original GC scheme, while its disparity estimation quality is nearly identical to that of the original GC.
Title: A fast graph cut algorithm for disparity estimation
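The sweep structure with an early termination rule can be sketched as follows. A real α-β swap move solves a min-cut on a graph over the pixels labeled α or β; the surrogate below uses only a unary data cost so the control flow stays visible, and the comment marks where the paper's pair prioritization would apply:

```python
def energy(labels, data):
    """Unary-only energy: absolute deviation of labels from observations."""
    return sum(abs(l - d) for l, d in zip(labels, data))

def swap_move(labels, data, a, b):
    """Simplified alpha-beta swap: pixels labeled a or b may switch between
    them (a real implementation solves a min-cut; this is a sketch)."""
    return [min((a, b), key=lambda x: abs(x - d)) if l in (a, b) else l
            for l, d in zip(labels, data)]

def fast_swap(labels, data, label_set, eps=1e-9, max_sweeps=10):
    """Sweep over label pairs; stop early once a full sweep yields
    negligible energy gain (the early termination rule)."""
    pairs = [(a, b) for i, a in enumerate(label_set)
             for b in label_set[i + 1:]]      # could be sorted by expected gain
    e = energy(labels, data)
    for _ in range(max_sweeps):
        gain = 0.0
        for a, b in pairs:
            cand = swap_move(labels, data, a, b)
            e_new = energy(cand, data)
            if e_new < e:
                gain += e - e_new
                labels, e = cand, e_new
        if gain <= eps:
            break
    return labels, e
```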
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702438
Tuan-Anh Nguyen, M. Kim, Min-Cheol Hong
In this paper, we propose a spatially adaptive noise removal algorithm using local statistics that consists of two stages: noise detection and removal. To incorporate desirable properties into the denoising process, the local weighted mean, local weighted activity, and local maximum are defined. With these local statistics, a noise detection function is defined, and a modified Gaussian filter is used to suppress the detected noise components. The experimental results demonstrate the effectiveness of the proposed algorithm.
Title: Fast and efficient Gaussian noise image restoration algorithm by spatially adaptive filtering
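The two-stage detect-then-filter idea can be sketched on a 1-D signal. The statistics below (plain local mean, mean absolute deviation as "activity", a halved center weight in the smoothing kernel) are stand-ins for the paper's weighted definitions:

```python
def denoise(signal, win=3, k=2.0):
    """Two-stage denoising sketch: flag a sample as noisy when it deviates
    from the local mean by more than k times the local activity, then
    replace only flagged samples with a reduced-center-weight average."""
    n = len(signal)
    out = list(signal)
    for i in range(n):
        lo, hi = max(0, i - win), min(n, i + win + 1)
        nb = signal[lo:hi]
        mean = sum(nb) / len(nb)
        activity = sum(abs(x - mean) for x in nb) / len(nb)
        if abs(signal[i] - mean) > k * activity:          # noise detection
            # "modified Gaussian": the noisy center gets half weight
            w = [0.5 if j == i else 1.0 for j in range(lo, hi)]
            out[i] = sum(wj * x for wj, x in zip(w, nb)) / sum(w)
    return out
```

Because clean samples are left untouched, edges and texture survive better than with uniform smoothing, which is the point of the adaptive design.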
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702570
H. Sasaki, Z. Li, H. Kiya
One category of fast full-search block matching algorithms (BMAs) is based on the fast Fourier transform (FFT). In conventional methods of this category, the macroblock must be zero-padded to the search window size, so memory consumption and computational complexity depend heavily on the size difference between the macroblock and the search window. We propose a novel FFT-based BMA to solve this problem: the search window is divided into multiple sub search windows, which flexibly controls the size difference between the macroblock and each (sub) search window. Simulation results show the effectiveness of the proposed method.
Title: FFT-based full-search block matching using overlap-add method
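The baseline the paper improves on can be sketched as full-window FFT cross-correlation; note the zero-padding of the block to the full window size is exactly the cost the overlap-add decomposition into sub search windows avoids. The sketch (not the paper's method) matches a zero-mean block by correlation:

```python
import numpy as np

def fft_block_match(window, block):
    """Locate `block` inside `window` via FFT circular cross-correlation,
    cropped to positions where the block fits without wrap-around."""
    H, W = window.shape
    h, w = block.shape
    b = block - block.mean()
    # Zero-pad the block to the full window size (the expensive step that
    # overlap-add over sub search windows mitigates).
    corr = np.real(np.fft.ifft2(np.fft.fft2(window, (H, W)) *
                                np.conj(np.fft.fft2(b, (H, W)))))
    corr = corr[: H - h + 1, : W - w + 1]   # valid (non-wrapping) positions
    return np.unravel_index(np.argmax(corr), corr.shape)
```

Splitting `window` into overlapping sub windows and summing the per-piece correlations (overlap-add) gives the same result with FFTs sized near the macroblock rather than the full window.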
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702481
Kyung-Yeon Min, Seanae Park, Donggyu Sim
In this paper, we propose a new distributed video coding (DVC) method based on adaptive slice size using received motion vectors (MVs). In the proposed algorithm, the MVs estimated at the DVC decoder are transmitted back to the encoder, where predicted side information (PSI) is reconstructed from the transmitted MVs and the key frames. The PSI is therefore identical to the side information (SI) generated at the decoder, so the encoder can compute the exact crossover probability between the SI and the original input frame. As a result, the proposed method transmits the minimum number of parity bits needed to maximize the error-correction ability of the channel decoder, with minimal computational complexity. Experimental results show that the proposed algorithm outperforms several conventional DVC methods.
Title: Distributed video coding based on adaptive slice size using received motion vectors
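The crossover-probability computation is straightforward once the encoder holds a PSI identical to the decoder's SI: count disagreeing bitplane positions, then size the parity budget from it. The binary-entropy bound below is the ideal Slepian-Wolf limit, not the paper's exact rate rule:

```python
import math

def crossover_probability(original_bits, side_info_bits):
    """Fraction of positions where the side information disagrees with the
    original bitplane; computable exactly at the encoder when PSI == SI."""
    assert len(original_bits) == len(side_info_bits)
    flips = sum(o != s for o, s in zip(original_bits, side_info_bits))
    return flips / len(original_bits)

def min_parity_rate(p):
    """Ideal Slepian-Wolf bound: parity bits per source bit ~ H(p),
    the binary entropy of the crossover probability."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)
```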
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702569
K. R. Vijayanagar, Bowen Dan, Joohee Kim
Distributed Video Coding (DVC) is a popular research topic, and the past years have seen several different implementations. DVC has been proposed as a solution for applications with limited battery resources and low hardware complexity, which necessitate a low-complexity encoder; ideal applications include remote surveillance/monitoring and live video conferencing. However, current solutions use iteratively decodable channel codes such as LDPCA or Turbo codes, whose large latencies hinder real-time communication. The proposed architecture makes efficient use of Skip blocks to reduce the bitrate, eliminates iterative decoding of the Wyner-Ziv (WZ) channel, and uses a simple data-hiding-based compression algorithm. This drastically cuts the time complexity of decoding while maintaining rate-distortion performance better than that of H.264/AVC Intra coding and other current DVC solutions.
Title: Low delay Distributed Video Coding using data hiding
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702576
Muhammad Majid, G. Abhayaratne
In this paper, we present a new method for scalable multiple description video coding based on motion-compensated temporal filtering and a multiple description scalar quantizer with successive refinement. In our method, quality scalability is achieved by successively refining the side quantizers of the multiple description scalar quantizer. The rate of each description is allocated by considering different refinement levels for each spatio-temporal subband. The performance of the proposed scheme under lossless and lossy channel conditions is presented and compared with single-description scalable video coding.
Title: Scalable multiple description video coding using successive refinement of side quantizers
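A common MDSQ construction (two uniform side quantizers staggered by half a cell, which may differ from the paper's index assignment) illustrates why either description alone gives coarse quality while both together halve the effective cell width:

```python
import math

def mdsq_encode(x, delta=1.0):
    """Two staggered uniform side quantizers: description 2's grid is
    offset by delta/2 relative to description 1's."""
    i1 = math.floor(x / delta)          # description 1 index
    i2 = math.floor(x / delta + 0.5)    # description 2 index (offset grid)
    return i1, i2

def mdsq_decode(i1=None, i2=None, delta=1.0):
    """Central decoder intersects the two cells; each side decoder
    reconstructs to its own cell midpoint."""
    if i1 is not None and i2 is not None:
        lo = max(i1 * delta, (i2 - 0.5) * delta)
        hi = min((i1 + 1) * delta, (i2 + 0.5) * delta)
        return (lo + hi) / 2            # central reconstruction
    if i1 is not None:
        return (i1 + 0.5) * delta       # side decoder 1
    return i2 * delta                   # side decoder 2
```

Successive refinement as in the paper would further subdivide each side quantizer's cells, sending the extra index bits as enhancement layers.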
Pub Date: 2010-12-01 | DOI: 10.1109/PCS.2010.5702480
Gilbert Yammine, Eugen Wige, André Kaup
In this paper, we provide a simple method for analyzing the GOP structure of an MPEG-2 or H.264/AVC decoded video without having access to the bitstream. Noise estimation is applied on the decoded frames and the variance of the noise in the different I-, P-, and B-frames is measured. After the encoding process, the noise variance in the video sequence shows a periodic pattern, which helps in the extraction of the GOP period, as well as the type of frames. This algorithm can be used along with other algorithms to blindly analyze the encoding history of a video sequence. The method has been tested on several MPEG-2 DVB and DVD streams, as well as on H.264/AVC encoded sequences, and shows successful results in both cases.
Title: Blind GOP structure analysis of MPEG-2 and H.264/AVC decoded video
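The period extraction step can be sketched as autocorrelation of the per-frame noise-variance series: because I-, P-, and B-frames leave systematically different residual noise levels, the mean-removed series correlates most strongly at lags equal to the GOP period. This is a simple stand-in for the paper's analysis, with the variance series assumed already estimated:

```python
def gop_period(variances):
    """Estimate the GOP period from a per-frame noise-variance series:
    return the lag with the largest normalized autocorrelation of the
    mean-removed series (ties resolved toward the smallest lag)."""
    n = len(variances)
    mean = sum(variances) / n
    x = [v - mean for v in variances]
    best_lag, best = 0, float("-inf")
    for lag in range(1, n // 2):
        score = sum(x[i] * x[i + lag] for i in range(n - lag)) / (n - lag)
        if score > best:
            best_lag, best = lag, score
    return best_lag
```

Given the period, frame types follow from the position of each frame's variance within the repeating pattern (e.g. the periodic minima mark the I-frames).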