Measuring orderliness based on social force model in collective motions
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706431
Yu Bai, Yi Xu, Xiaokang Yang, Qing Yan
Collective motion, one of the coordinated behaviors of crowd systems, exists widely in nature. Orderliness characterizes how smoothly and consistently an individual moves with its neighbors in collective motion, and measuring it is still an open problem in computer vision. In this paper, we propose an orderliness descriptor based on the correlation of interactive social forces between individuals. To include the force correlation between two individuals separated by a distance, we propose a Social Force Correlation Propagation algorithm that calculates the orderliness of every individual effectively and efficiently. We validate the effectiveness of the proposed orderliness descriptor on synthetic simulations. Experimental results on challenging videos of real crowd scenes demonstrate that the orderliness descriptor can perceive motion with low smoothness and locate disorder.
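The abstract leaves the propagation step abstract; as a rough illustration of the idea only, the sketch below scores each individual by the direction correlation of its social force with individuals reached hop by hop through the neighbor graph, attenuating the contribution per hop. All names, the hop limit, and the decay factor are our assumptions, not the authors' algorithm.

```python
import numpy as np

def orderliness_scores(forces, neighbors, hops=2, decay=0.7):
    """Hypothetical sketch of force-correlation propagation (not the
    paper's algorithm). forces: (N, 2) per-individual social-force
    vectors; neighbors[i]: indices of i's spatial neighbors; correlation
    with more distant individuals is attenuated by `decay` per hop."""
    unit = forces / (np.linalg.norm(forces, axis=1, keepdims=True) + 1e-9)
    scores = np.zeros(len(forces))
    for i in range(len(forces)):
        visited, frontier = {i}, {i}
        total, count = 0.0, 0
        for hop in range(1, hops + 1):
            frontier = {k for j in frontier for k in neighbors[j]} - visited
            for k in frontier:
                # correlation = cosine similarity of force directions,
                # weighted down the farther it has propagated
                total += (decay ** hop) * float(unit[i] @ unit[k])
                count += 1
            visited |= frontier
        scores[i] = total / max(count, 1)  # high = orderly, low = disordered
    return scores
```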
{"title":"Measuring orderliness based on social force model in collective motions","authors":"Yu Bai, Yi Xu, Xiaokang Yang, Qing Yan","doi":"10.1109/VCIP.2013.6706431","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706431","url":null,"abstract":"Collective motions, one of the coordinated behaviors in crowd system, widely exist in nature. Orderliness characterizes how well an individual will move smoothly and consistently with his neighbors in collective motions. It is still an open problem in computer vision. In this paper, we propose an orderliness descriptor based on correlation of interactive social force between individuals. In order to include the force correlation between two individuals in a distance, we propose a Social Force Correlation Propagation algorithm to calculate orderliness of every individual effectively and efficiently. We validate the effectiveness of the proposed orderliness descriptor on synthetic simulation. Experimental results on challenging videos of real scene crowds demonstrate that orderliness descriptor can perceive motion with low smoothness and locate disorder.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"137 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131719547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient key picture and single loop decoding scheme for SHVC
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706453
K. Rapaka, Jianle Chen, M. Karczewicz
Scalable video coding has been a popular research topic for many years. One of its key objectives is to support different receiving devices connected through a network using a single bitstream. The scalable video coding extension of HEVC, called SHVC, is being developed by the Joint Collaborative Team on Video Coding (JCT-VC) of ISO/IEC MPEG and ITU-T VCEG. Compared to previously standardized scalable video coding technologies, SHVC employs a multi-loop decoding design with no low-level changes within any given layer relative to HEVC. With such a simplified extension, it aims to solve some of the problems of previous scalable extensions that have not been successful, while at the same time supporting all design features that are vital to its success. Lightweight and finely tunable bandwidth adaptation is one such vital design feature. This paper proposes a novel high-level syntax mechanism for SHVC quality scalability that supports: (a) using the decoded pictures of a higher quality layer as references for lower-layer pictures, together with a key-picture concept to reduce drift; and (b) a single-loop decoding design realized through encoder-only constraints, without introducing any normative low-level changes to the normal multi-loop decoding process. Experimental results based on the SHVC reference software (SHM 2.0) show that the proposed key-picture method achieves an average 2.9% luma BD-rate reduction in the multi-loop framework, and incurs an average 4.4% luma BD-rate loss to attain the capability of single-loop decoding.
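As a toy illustration of the key-picture concept (our reading of the abstract, with an invented period parameter), the sketch below re-anchors every key_period-th picture of a lower-quality layer on the decoded higher-quality-layer picture, while the pictures in between predict within their own layer:

```python
def pick_reference(poc, key_period=8):
    """Toy key-picture schedule (key_period is illustrative). Key pictures
    re-anchor the lower-quality layer on the decoded higher-quality-layer
    picture, bounding how far drift can accumulate when the bitstream is
    adapted down and the higher layer is discarded."""
    if poc % key_period == 0:
        return 'higher-quality-layer reconstruction'   # drift reset point
    return 'own-layer reconstruction'                  # drift may accumulate
```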
{"title":"Efficient key picture and single loop decoding scheme for SHVC","authors":"K. Rapaka, Jianle Chen, M. Karczewicz","doi":"10.1109/VCIP.2013.6706453","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706453","url":null,"abstract":"Scalable video coding has been a popular research topic for many years. As one of its key objectives, it aims to support different receiving devices connected through a network structure using a single bitstream. Scalable video coding extension of HEVC, also called as SHVC, is being developed by Joint Collaborative Team on Video Coding (JCT-VC) of ISO/IEC MPEG and ITU-T VCEG. Compared to previous standardized scalable video coding technologies, SHVC employs multi-loop decoding design with no low-level changes within any given layer compared to HEVC. With such a simplified extension it aims at solving some of the problems of previous scalable extensions that haven't been successful, and at the same time, aims at supporting all design features that are of vital importance for the success of SHVC. Supporting lightweight and finely tunable bandwidth adaptation is one such vital design feature important for the success of SHVC. This paper proposes novel high level syntax mechanism for SHVC quality scalability to support: (a) using the decoded pictures from higher quality layer as reference for lower layer pictures and key pictures concept to reduce drift; (b) single loop decoding design with encoder only constraints without introducing any normative low-level changes to the normal multi-loop decoding process. Experimental results based on SHVC reference software (SHM 2.0) show that the proposed key picture method achieves an average of 2.9% luma BD-rate reduction in multi-loop framework and an average of 4.4% luma BD-rate loss to attain the capability of single loop decoding.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"84 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131791237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On secondary transforms for scalable video coding
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706392
A. Saxena, Felix C. A. Fernandes
In this paper, we present a secondary transform scheme for the inter-layer prediction residue in scalable video coding (SVC). Efficient prediction of co-located blocks from the base layer (BL) can significantly improve enhancement layer (EL) coding in SVC, especially when the temporal information from previous EL frames is less correlated than the co-located BL information. However, Guo et al. showed that because of the peculiar frequency characteristics of EL residuals, the conventional DCT Type-2 transform is suboptimal and is often outperformed by either the DCT Type-3 or the DST Type-3 when these transforms are applied to the EL residuals. Their technique, however, requires up to 8 additional transform cores, two of which are of size 32×32. In this work, we propose a secondary transform scheme in which the proposed transform is applied only to the lower 8×8 frequency coefficients after the DCT, for block sizes from 8×8 to 32×32. Our scheme requires at most 2 additional cores. We also propose a low-complexity 8×8 rotational transform as a special case of secondary transforms. Simulation results show that the proposed scheme provides significant BD-rate improvement over the conventional DCT-based coding scheme for video sequences in the ongoing scalable extension of HEVC standardization.
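A minimal sketch of the band-limited secondary transform described above, using SciPy's DCT for the primary stage; the actual 8×8 secondary matrix S (e.g., the proposed rotational transform) is specified in the paper and is left as an input here:

```python
import numpy as np
from scipy.fft import dctn

def apply_secondary_transform(residual, S):
    """Sketch: primary 2-D DCT Type-2 on the whole block, then a separable
    secondary transform on the lower 8x8 frequency band only.
    residual: N x N block with N in {8, 16, 32}; S: 8 x 8 orthonormal
    secondary transform matrix (e.g., a rotational transform)."""
    coeffs = dctn(residual, norm='ortho')        # primary DCT Type-2
    coeffs[:8, :8] = S @ coeffs[:8, :8] @ S.T    # secondary stage, low band
    return coeffs
```

Because only one 8×8 secondary core is needed regardless of the block size, this is consistent with the claim of at most 2 additional cores.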
{"title":"On secondary transforms for scalable video coding","authors":"A. Saxena, Felix C. A. Fernandes","doi":"10.1109/VCIP.2013.6706392","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706392","url":null,"abstract":"In this paper, we present a secondary transform scheme for inter-layer prediction residue in scalable video coding (SVC). Efficient prediction of the co-located blocks from the base layer (BL) can significantly improve the enhancement layer (EL) coding in SVC, especially when the temporal information from previous EL frames is less correlated than the co-located BL information. However, Guo et al. showed that because of the peculiar frequency characteristics of EL residuals, the conventional DCT Type-2 transform is suboptimal and is often outperformed by either the DCT Type-3, or DST Type-3 when these transforms are applied to the EL residuals. However, their proposed technique requires upto 8 additional transform cores, two of which are of size 32×32. Here, in this work, we propose a secondary transform scheme, where the proposed transform is applied only to the lower 8x8 frequency coefficients after DCT, for block sizes 8×8 to 32×32. Our proposed transform scheme requires at most only 2 additional cores. We also propose a low-complexity 8x8 Rotational Transform as a special case of secondary transforms in this paper. Simulation results show that the proposed transform scheme provides significant BD-Rate improvement over the conventional DCT-based coding scheme for video sequences in the ongoing scalable extensions of HEVC standardization.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129406104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scene-aware perceptual video coding
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706427
Fei Liang, Xiulian Peng, Jizheng Xu
The mean-square-error (MSE) distortion criterion used in state-of-the-art video coding standards, e.g., H.264/AVC and the High Efficiency Video Coding (HEVC) standard under development, is widely criticized as a poor measure of perceived visual quality. Existing research on perceptual video coding mainly employs low-level features of images/video, which cannot take into account the big picture people see. This paper proposes a scene-aware perceptual video coding scheme (SAPC), which accommodates human visual perception by reconstructing the scene from the video and performing scene-based bit allocation. Specifically, more bits are allocated to foreground objects and their boundaries, considering that people tend to pay more attention to the foreground and that object boundaries are prone to blurring at low bitrates due to object occlusion. Structure-from-motion (SFM) technology is employed for scene reconstruction. Experiments taking HEVC as the benchmark show that our algorithm gives better visual quality than the original HEVC encoder at the same bitrate.
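As a sketch of what scene-based bit allocation can look like in practice (the QP deltas and CTU size here are our assumptions, not the paper's), one can lower the QP on CTUs covered by the reconstructed foreground and, more strongly, on CTUs straddling object boundaries:

```python
import numpy as np

def ctu_qp_map(fg_mask, base_qp, fg_delta=-3, boundary_delta=-4, ctu=64):
    """Toy scene-based bit allocation: more bits (lower QP) for foreground
    CTUs, and even more for CTUs that straddle an object boundary, where
    blur is most visible. fg_mask: binary per-pixel foreground map."""
    rows = (fg_mask.shape[0] + ctu - 1) // ctu
    cols = (fg_mask.shape[1] + ctu - 1) // ctu
    qp = np.full((rows, cols), base_qp, dtype=int)
    for r in range(rows):
        for c in range(cols):
            frac = fg_mask[r*ctu:(r+1)*ctu, c*ctu:(c+1)*ctu].mean()
            if 0.0 < frac < 1.0:
                qp[r, c] += boundary_delta   # mixed CTU: object boundary
            elif frac == 1.0:
                qp[r, c] += fg_delta         # pure foreground CTU
    return qp
```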
{"title":"Scene-aware perceptual video coding","authors":"Fei Liang, Xiulian Peng, Jizheng Xu","doi":"10.1109/VCIP.2013.6706427","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706427","url":null,"abstract":"The mean-square-error (MSE) distortion criterion used in the state-of-the-art video coding standards, e.g. H.264/AVC and the High Efficiency Video Coding (HEVC) under standardization, is widely criticized for poor measurement of perceived visual quality. Existing research on perceptual video coding mainly employs low-level features of images/video, which cannot take into account the big picture people see. This paper proposes a scene-aware perceptual video coding scheme (SAPC), which accommodates human visual perception of the scene by reconstructing the scene from video and perform scene-based bits allocation. To be specific, more bits are allocated to the foreground object and its boundaries considering that people tend to pay more attention to the foreground and object boundaries are prone to blur at low bitrates for object occlusion. The structure from motion (SFM) technology is employed for scene reconstruction. Experiments taking HEVC as the benchmark show that our algorithm can give better visual quality than the original HEVC encoder at the same bitrate.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130955513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neighboring block based disparity vector derivation for 3D-AVC
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706401
Li Zhang, Jewon Kang, Xin Zhao, Ying Chen, R. Joshi
3D-AVC, being developed by the Joint Collaborative Team on 3D Video Coding (JCT-3V), significantly outperforms Multiview Video Coding plus Depth (MVC+D), which adds no new macroblock-level coding tools to the multiview video coding extension of H.264/AVC (MVC). However, in the multiview compatible configuration, i.e., when texture views are decoded without accessing depth information, the performance of the current 3D-AVC is only marginally better than MVC+D. The problem is caused by the lack of disparity vectors, which in 3D-AVC can be obtained only from the coded depth views. In this paper, a disparity vector derivation method based on the motion information of neighboring blocks is proposed and applied along with the existing coding tools of 3D-AVC. The proposed method improves 3D-AVC in the multiview compatible mode substantially, resulting in about 20% bitrate reduction for texture coding. When the so-called view synthesis prediction is enabled to further refine the disparity vectors, the proposed method performs 31% better than MVC+D and even better than 3D-AVC under the best-performing 3D-AVC configuration.
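The derivation can be pictured as a fixed-order scan of neighboring blocks for an inter-view (disparity) motion vector; the scan order and fallback below are illustrative, not the normative 3D-AVC process:

```python
def derive_disparity_vector(temporal_nbrs, spatial_nbrs, default=(0, 0)):
    """Sketch of neighboring-block disparity vector derivation: return the
    first disparity motion vector (an MV that points into another view)
    found among the neighbors. Each neighbor is a (mv, is_interview) pair."""
    for mv, is_interview in list(temporal_nbrs) + list(spatial_nbrs):
        if is_interview:
            return mv      # reuse the neighbor's disparity vector as-is
    return default         # no neighbor carries one: zero disparity vector
```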
{"title":"Neighboring block based disparity vector derivation for 3D-AVC","authors":"Li Zhang, Jewon Kang, Xin Zhao, Ying Chen, R. Joshi","doi":"10.1109/VCIP.2013.6706401","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706401","url":null,"abstract":"3D-AVC, being developed under Joint Collaborative Team on 3D Video Coding (JCT-3V), significantly outperforms the Multiview Video Coding plus Depth (MVC+D) which has no new macroblock level coding tools compared to Multiview video coding extension of H.264/AVC (MVC). However, for multiview compatible configuration, i.e., when texture views are decoded without accessing depth information, the performance of the current 3D-AVC is only marginally better than MVC+D. The problem is caused by the lack of disparity vectors which can be obtained only from the coded depth views in 3D-AVC. In this paper, a disparity vector derivation method is proposed by using the motion information of neighboring blocks and applied along with existing coding tools in 3D-AVC. The proposed method improves 3D-AVC in the multiview compatible mode substantially, resulting in about 20% bitrate reduction for texture coding. When enabling the so-called view synthesis prediction to further refine the disparity vectors, the performance of the proposed method is 31% better than MVC+D and even better than 3D-AVC under the best performing 3D-AVC configuration.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130984406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Transform coefficient coding design for AVS2 video coding standard
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706447
Jing Wang, Xiaofeng Wang, Tianying Ji, Dake He
AVS2 is a next-generation audio and video coding standard currently under development by the Audio Video Coding Standard Workgroup of China. In this paper, a coefficient-group based transform coefficient coding design for the AVS2 video coding standard is presented, comprising two main coding tools: two-level coefficient coding and intra-mode based context design. The two-level coefficient coding scheme allows accurate coefficient position information to be used in the context model design and improves coding efficiency; it also helps increase entropy coding throughput and facilitates parallel implementation. The intra-mode based context design further improves coding performance by utilizing the intra-prediction mode information in the context model. Combined, the two coding tools provide consistent rate-distortion performance gains under standard test conditions, and both were adopted into the AVS2 working draft. Furthermore, an improved rate-distortion optimized quantization algorithm is designed based on the proposed scheme, which significantly reduces encoder complexity.
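A minimal sketch of the two-level idea (the 4×4 coefficient-group size follows common practice; context modeling and binarization are omitted): level 1 signals which groups contain nonzero coefficients, and level 2 codes coefficients only inside significant groups, so the group position is known when choosing contexts.

```python
import numpy as np

def two_level_coefficient_scan(block, cg=4):
    """Level 1: one significance flag per 4x4 coefficient group (CG).
    Level 2: coefficient data only for significant CGs, so accurate CG
    position information is available to the context models."""
    flags, groups = [], []
    n = block.shape[0]
    for gy in range(0, n, cg):
        for gx in range(0, n, cg):
            group = block[gy:gy+cg, gx:gx+cg]
            significant = bool(np.any(group))
            flags.append(significant)
            if significant:
                groups.append(((gy, gx), group.copy()))
    return flags, groups
```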
{"title":"Transform coefficient coding design for AVS2 video coding standard","authors":"Jing Wang, Xiaofeng Wang, Tianying Ji, Dake He","doi":"10.1109/VCIP.2013.6706447","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706447","url":null,"abstract":"AVS2 is a next-generation audio and video coding standard currently under development by the Audio Video Coding Standard Workgroup of China. In this paper, a coefficient-group based transform coefficient coding design for AVS2 video coding standard is presented, which includes two main coding tools, namely, two-level coefficient coding and intra-mode based context design. The two-level coefficient coding scheme allows accurate coefficient position information to be used in the context model design and improves the coding efficiency. It also helps increase the entropy coding throughput and facilitate parallel implementation. The intra-mode based context design further improves coding performance by utilizing the intra-prediction mode information in the context model. The two coding tools combined provide consistent rate-distortion performance gains under standard test conditions. Both tools were adopted into the AVS2 working draft. Furthermore, an improved rate-distortion optimized quantization algorithm is designed based on the proposed scheme, which significantly reduces the encoder complexity.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"82 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133543161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Effective stereo matching using reliable points based graph cut
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706415
Haoqian Wang, M. Wu, Yongbing Zhang, Lei Zhang
In this paper, we propose an effective stereo matching algorithm using reliable points and region-based graph cut. First, initial disparity maps are calculated via a local window-based method. Second, unreliable points are detected according to the disparity space image (DSI), and the estimated disparity value of each unreliable point is obtained from its surrounding points. The reliable points are then introduced into a region-based graph cut framework to optimize the initial result. Finally, remaining errors in the disparity results are handled in a multi-step refinement process. Experimental results show that the proposed algorithm achieves a significant reduction in computation cost while guaranteeing high matching quality.
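One simple way to flag unreliable points from the DSI is an ambiguity (ratio) test between the best and second-best matching costs; the paper's exact criterion may differ, and the threshold below is illustrative:

```python
import numpy as np

def unreliable_points(dsi, ratio=0.95):
    """dsi: (H, W, D) matching-cost volume, lower cost = better match.
    A pixel is unreliable when its best cost is not clearly better than
    its second-best cost, i.e., the match is ambiguous."""
    two_smallest = np.partition(dsi, 1, axis=2)
    best, second = two_smallest[..., 0], two_smallest[..., 1]
    return best > ratio * second   # True = unreliable, to be re-estimated
```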
{"title":"Effective stereo matching using reliable points based graph cut","authors":"Haoqian Wang, M. Wu, Yongbing Zhang, Lei Zhang","doi":"10.1109/VCIP.2013.6706415","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706415","url":null,"abstract":"In this paper, we propose an effective stereo matching algorithm using reliable points and region-based graph cut. Firstly, the initial disparity maps are calculated via local windowbased method. Secondly, the unreliable points are detected according to the DSI(Disparity Space Image) and the estimated disparity values of each unreliable point are obtained by considering its surrounding points. Then, the scheme of reliable points is introduced in region-based graph cut framework to optimize the initial result. Finally, remaining errors in the disparity results are effectively handled in a multi-step refinement process. Experiment results show that the proposed algorithm achieves a significant reduction in computation cost and guarantee high matching quality.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132592622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A coding unit classification based AVC-to-HEVC transcoding with background modeling for surveillance videos
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706393
Peiyin Xing, Yonghong Tian, Xianguo Zhang, Yaowei Wang, Tiejun Huang
Since HEVC doubles the compression ratio, fast and efficient methods for transcoding perennial surveillance videos to HEVC can save substantial storage and transmission cost. Exploiting the long-time static background characteristic of surveillance videos, this paper presents a coding unit (CU) classification based AVC-to-HEVC transcoding method with background modeling. In our method, a background frame modeled from the decoded frames is first transcoded into the HEVC stream as a long-term reference to enhance prediction efficiency. A CU classification algorithm, which takes the decoded motion vectors and the modeled background frame as input, then divides the decoded data into background, foreground and hybrid CUs. Different transcoding strategies of CU partition termination, prediction unit candidate selection and motion estimation simplification are applied to the different CU categories to reduce complexity. Experimental results show that our method achieves 45% bit saving and 50% complexity reduction against traditional AVC-to-HEVC transcoding.
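A rough sketch of the three-way classification (all thresholds are our assumptions): a CU is background when it matches the modeled background frame and its decoded MVs are near zero, foreground when it differs almost everywhere, and hybrid otherwise.

```python
import numpy as np

def classify_cu(cu_pixels, bg_pixels, decoded_mvs, diff_thr=8, mv_thr=1):
    """Toy CU classifier. cu_pixels/bg_pixels: co-located luma blocks;
    decoded_mvs: (mvx, mvy) pairs decoded from the AVC stream."""
    changed = np.abs(cu_pixels.astype(int) - bg_pixels.astype(int)) > diff_thr
    moving = any(abs(x) + abs(y) > mv_thr for x, y in decoded_mvs)
    frac = changed.mean()
    if frac < 0.05 and not moving:
        return 'background'   # terminate CU splitting, favor skip/merge
    if frac > 0.95:
        return 'foreground'   # full mode decision and motion search
    return 'hybrid'           # keep splitting so sub-CUs separate cleanly
```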
{"title":"A coding unit classification based AVC-to-HEVC transcoding with background modeling for surveillance videos","authors":"Peiyin Xing, Yonghong Tian, Xianguo Zhang, Yaowei Wang, Tiejun Huang","doi":"10.1109/VCIP.2013.6706393","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706393","url":null,"abstract":"To save the storage and transmission cost, it is applicable now to develop fast and efficient methods to transcode the perennial surveillance videos to HEVC ones, since HEVC has doubled the compression ratio. Considering the long-time static background characteristic of surveillance videos, this paper presents a coding unit (CU) classification based AVC-to-HEVC transcoding method with background modeling. In our method, the background frame modeled from originally decoded frames is firstly transcoded into HEVC stream as long-term reference to enhance the prediction efficiency. Afterwards, a CU classification algorithm which employs decoded motion vectors and the modeled background frame as input is proposed to divide the decoded data into background, foreground and hybrid CUs. Following this, different transcoding strategies of CU partition termination, prediction unit candidate selection and motion estimation simplification are adopted for different CU categories to reduce the complexity. Experimental results show our method can achieve 45% bit saving and 50% complexity reduction against traditional AVC-to-HEVC transcoding.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115364796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Progressive motion vector resolution for HEVC
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706363
Juncheng Ma, Jicheng An, Kai Zhang, Siwei Ma, S. Lei
This paper proposes a progressive motion vector resolution (PMVR) method for High Efficiency Video Coding (HEVC). In the proposed scheme, high motion vector (MV) resolutions, e.g., 1/4- or 1/8-pixel, are employed for MVs near the motion vector predictor (MVP), and low MV resolutions are employed for MVs far from the MVP. The range of each MV resolution is indicated by a threshold parameter, and a new motion vector difference (MVD) derivation method is designed to encode the MVD efficiently. Experimental results show that PMVR with 1/8-pixel motion search can achieve a BD-rate gain of up to 16% with almost the same coding time as HM 8.0; without 1/8-pixel motion search, up to 6.1% BD-rate gain can be achieved with 9% encoding time saving on average.
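The core of PMVR can be sketched as a distance-dependent quantization of the MV grid; the threshold and the coarse grid below are illustrative, not the paper's parameters.

```python
def pmvr_quantize(mv, mvp, thr=16):
    """Sketch: components within thr quarter-pel units of the MVP keep
    1/4-pel resolution; farther components are rounded to the 1/2-pel
    grid, shrinking the MVD alphabet the encoder must signal. The decoder
    can apply the same rule, so no extra bits signal the resolution."""
    out = []
    for v, p in zip(mv, mvp):
        if abs(v - p) <= thr:
            out.append(v)                     # fine grid near the MVP
        else:
            out.append(round(v / 2.0) * 2)    # coarse 1/2-pel grid far away
    return tuple(out)
```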
{"title":"Progressive motion vector resolution for HEVC","authors":"Juncheng Ma, Jicheng An, Kai Zhang, Siwei Ma, S. Lei","doi":"10.1109/VCIP.2013.6706363","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706363","url":null,"abstract":"This paper proposes a progressive motion vector resolution (PMVR) method for High Efficiency Video Coding (HEVC). In the proposed scheme, high motion vector (MV) resolutions, e.g. 1/4 or 1/8 pixel resolution, are employed for MVs near to the motion vector predictor (MVP) and low MV resolutions are employed for MVs far from the MVP. The range of each MV resolution is indicated by a threshold parameter. And a new motion vector difference (MVD) derivation method is designed to encode MVD efficiently. Experimental results show that PMVR with 1/8 pixel motion search can achieve a BD-rate gain up to 16% with almost the same coding time with HM8.0, and for PMVR without 1/8 pixel motion search, up to 6.1% BD-rate gain can be achieved with 9% encoding time saving on average.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"200 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124255510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wavelet based smoke detection method with RGB Contrast-image and shape constrain
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706406
Jia Chen, Yaowei Wang, Yonghong Tian, Tiejun Huang
Smoke detection in video surveillance is very important for early fire detection. A common viewpoint treats smoke as a low-frequency signal that smooths the background. However, some pure-color objects share this characteristic, and smoke also produces a high-frequency signal because of the rich edge information along its contour. To address these problems, an improved smoke detection method with an RGB contrast-image and a shape constraint is proposed. In this method, a wavelet transformation is applied to the RGB contrast-image to distinguish smoke from other low-frequency signals, and the existence of smoke is determined by jointly analyzing the shape and the energy change of the region. Experimental results show that our method outperforms conventional methods remarkably.
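A compact sketch of the two cues (the contrast-image definition and the energy-drop threshold are our assumptions): the RGB contrast image is flat for grayish smoke but not for pure-color movers, and smoke makes the region's wavelet detail energy fall relative to the background model.

```python
import numpy as np

def haar_detail_energy(gray):
    """One-level Haar detail (high-frequency) energy of a 2-D array;
    a simple stand-in for the paper's wavelet transform."""
    g = gray[:gray.shape[0] // 2 * 2, :gray.shape[1] // 2 * 2].astype(float)
    a, b = g[0::2, 0::2], g[0::2, 1::2]
    c, d = g[1::2, 0::2], g[1::2, 1::2]
    lh, hl, hh = (a - b + c - d) / 2, (a + b - c - d) / 2, (a - b - c + d) / 2
    return float((lh**2 + hl**2 + hh**2).sum())

def smoke_candidate(bg_rgb, cur_rgb, drop=0.4):
    """Flag a region as candidate smoke when the detail energy of its RGB
    contrast image falls well below that of the modeled background."""
    # contrast image: per-pixel max minus min over RGB; grayish smoke gives
    # small, flat values, while pure-color objects keep large ones
    contrast = lambda im: im.max(axis=2).astype(int) - im.min(axis=2).astype(int)
    return haar_detail_energy(contrast(cur_rgb)) < drop * haar_detail_energy(contrast(bg_rgb))
```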
{"title":"Wavelet based smoke detection method with RGB Contrast-image and shape constrain","authors":"Jia Chen, Yaowei Wang, Yonghong Tian, Tiejun Huang","doi":"10.1109/VCIP.2013.6706406","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706406","url":null,"abstract":"Smoke detection in video surveillance is very important for early fire detection. A general viewpoint assumes that smoke is a low frequency signal which may smoothen the background. However, some pure-color objects also have this characteristic, and smoke also produces high frequency signal because the rich edge information of its contour. In order to solve these problems, an improved smoke detection method with RGB Contrast-image and shape constrain is proposed. In this method, wavelet transformation is implemented based on the RGB Contrast-image to distinguish smoke from other low frequency signals, and the existence of smoke is determined by analyzing the combination of the shape and the energy change of the region. Experimental results show our method outperforms the conventional methods remarkably.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114570181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}