
2013 Visual Communications and Image Processing (VCIP): Latest Publications

Laplace distribution based CTU level rate control for HEVC
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706333
Junjun Si, Siwei Ma, Shiqi Wang, Wen Gao
This paper proposes a coding tree unit (CTU) level rate control for HEVC based on Laplace distribution modeling of the transformed residuals. First, we study the relationship among the optimal quantization step, the Laplace parameter, and the Lagrange multiplier. Based on this relationship model, the quantization parameter of each CTU can be adjusted dynamically according to the distribution of its transformed residuals. Second, a CTU-level rate control scheme is proposed that achieves accurate rate control as well as high coding performance. Experimental results show that the proposed scheme outperforms state-of-the-art HEVC rate control schemes in terms of both objective and subjective quality.
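As an illustration of the kind of per-CTU adaptation the abstract describes, here is a minimal numpy sketch: it estimates the Laplace parameter of each CTU's transformed residuals by maximum likelihood and maps a scaled Lagrange multiplier back to a QP with the common HEVC heuristic QP ≈ 3·log2(λ/0.85) + 12. The scaling rule and all function names are hypothetical stand-ins, not the relationship model derived in the paper.

```python
import numpy as np

def laplace_param(residuals):
    """MLE of the Laplace scale parameter of zero-mean transformed residuals."""
    return 1.0 / max(float(np.mean(np.abs(residuals))), 1e-6)

def qp_from_lambda(lam):
    """Common HEVC heuristic lambda = 0.85 * 2**((QP - 12) / 3), inverted."""
    return 3.0 * np.log2(lam / 0.85) + 12.0

def ctu_qp(frame_lambda, frame_residuals, ctu_residuals):
    # Hypothetical adjustment: scale the frame-level Lagrange multiplier by
    # the ratio of the frame's Laplace parameter to the CTU's, so CTUs with
    # heavier-tailed residuals get a different operating point, then clip
    # the resulting QP to the legal HEVC range [0, 51].
    lam = frame_lambda * laplace_param(frame_residuals) / laplace_param(ctu_residuals)
    return int(np.clip(round(qp_from_lambda(lam)), 0, 51))
```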
Citations: 23
A novel method for stereo matching using Gabor Feature Image and Confidence Mask
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706388
Haixu Liu, Yang Liu, Shuxin Ouyang, Chenyu Liu, Xueming Li
In this paper, we present a novel local-based algorithm for stereo matching using a Gabor Feature Image and a Confidence Mask. Various local-based schemes have been proposed in recent years; most of them use color difference as the evaluation criterion when constructing the initial cost volume, but color channels are highly sensitive to noise, illumination changes, and similar effects. We therefore develop a new cost function based on the Gabor Feature Image to obtain a more accurate matching cost volume. Furthermore, to eliminate the matching ambiguities introduced by the winner-takes-all method, an effective disparity refinement strategy using the Confidence Mask selects and refines the less reliable pixels. The proposed algorithm ranks 23rd out of over 150 (global-based and local-based) methods on the Middlebury data sets; both quantitative and qualitative evaluation show that it is comparable to state-of-the-art local-based stereo matching algorithms.
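To make the Gabor-phase idea concrete, here is a small numpy sketch of a phase-difference matching cost along one scanline; the 1-D Gabor kernel, its parameters, and the cost definition are illustrative choices, not the paper's exact Gabor Feature Image construction or Confidence Mask refinement.

```python
import numpy as np

def gabor_response(row, freq=0.1, sigma=4.0):
    """Complex 1-D Gabor response along a scanline."""
    x = np.arange(-3 * sigma, 3 * sigma + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * 2 * np.pi * freq * x)
    return np.convolve(row, kernel, mode='same')

def phase_cost(left_row, right_row, max_disp):
    """Per-pixel matching cost from Gabor phase differences (one scanline)."""
    gl, gr = gabor_response(left_row), gabor_response(right_row)
    w = len(left_row)
    cost = np.full((w, max_disp + 1), np.pi)   # pi = worst possible phase gap
    for d in range(max_disp + 1):
        # left pixel x matches right pixel x - d; compare wrapped phases
        diff = np.angle(gl[d:] * np.conj(gr[:w - d]))
        cost[d:, d] = np.abs(diff)
    return cost  # disparity map: cost.argmin(axis=1) (winner-takes-all)
```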
Citations: 5
Gaze pattern analysis for video contents with different frame rates
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706429
Manri Cheon, Jong-Seok Lee
This paper presents a study of the viewing behavior of human subjects for video contents with different frame rates. Frame rate variability arises when temporal video scalability is used for adaptive video transmission, and the resulting variation in gaze patterns eventually affects visual perception, which must be considered during perceptual optimization of such a system. We design an eye-tracking experiment using several high-definition contents covering a wide range of content characteristics. Comparing gaze points under a normal frame rate condition and a low frame rate condition shows that, although the overall viewing pattern remains quite similar, statistically significant differences are observed for some time intervals. The difference is analyzed in terms of two factors: overall gaze paths and subject-wise variability.
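An interval-wise comparison of this kind could be run, for example, as a permutation test on per-interval gaze statistics. The sketch below is only one plausible analysis under stated assumptions (array shapes, test statistic, and the test itself are not taken from the paper); it returns one p-value per time interval.

```python
import numpy as np

def intervalwise_gaze_test(gaze_a, gaze_b, n_perm=2000, seed=0):
    """Permutation test on mean gaze-point distance per time interval.

    gaze_a, gaze_b: arrays of shape (n_subjects, n_intervals, 2) holding
    mean (x, y) gaze positions per subject and interval for two frame-rate
    conditions (assumed equal subject counts). Returns a p-value per interval.
    """
    rng = np.random.default_rng(seed)
    obs = np.linalg.norm(gaze_a.mean(0) - gaze_b.mean(0), axis=-1)
    pooled = np.concatenate([gaze_a, gaze_b], axis=0)
    n = gaze_a.shape[0]
    count = np.zeros(obs.shape)
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])      # shuffle condition labels
        pa, pb = pooled[idx[:n]], pooled[idx[n:]]
        stat = np.linalg.norm(pa.mean(0) - pb.mean(0), axis=-1)
        count += stat >= obs
    return (count + 1) / (n_perm + 1)               # small-sample correction
```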
Citations: 2
Endoscopy video summarization based on unsupervised learning and feature discrimination
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706410
M. Ismail, Ouiem Bchir, Ahmed Z. Emam
We propose a novel endoscopy video summarization approach based on unsupervised learning and feature discrimination. The proposed learning approach partitions the collection of video frames into homogeneous categories based on their visual and temporal descriptors. It also generates possibilistic memberships to represent the degree of typicality of each video frame within every category and to reduce the influence of noise frames on the learning process. The algorithm iteratively learns the optimal relevance weight for each feature subset within each cluster. Moreover, it finds the optimal number of clusters in an unsupervised and efficient way by exploiting properties of the possibilistic membership function. The endoscopy video summary consists of the most typical frames in all clusters after discarding noise frames. We compare the performance of the proposed algorithm with state-of-the-art learning approaches and show that the possibilistic approach is more robust. The endoscopy video collection includes more than 90k video frames.
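For the flavor of possibilistic memberships, here is a sketch based on the standard possibilistic c-means typicality formula u = 1 / (1 + (d²/η)^(1/(m−1))); the paper's algorithm additionally learns per-cluster feature weights and the number of clusters, which this sketch omits, and all names and the noise threshold are assumptions.

```python
import numpy as np

def possibilistic_memberships(frames, centers, eta, m=2.0):
    """Typicality u_ij = 1 / (1 + (d_ij^2 / eta_j)**(1/(m-1))) (PCM form)."""
    d2 = ((frames[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # (n, k)
    return 1.0 / (1.0 + (d2 / eta[None, :]) ** (1.0 / (m - 1.0)))

def summary_keyframes(frames, centers, eta, noise_thresh=0.2):
    """Most typical frame per cluster, after discarding noise frames."""
    u = possibilistic_memberships(frames, centers, eta)
    keep = u.max(axis=1) >= noise_thresh   # noise frames: atypical everywhere
    best = [int(np.argmax(u[:, j] * keep)) for j in range(centers.shape[0])]
    return sorted(set(best))
```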
Citations: 22
A control theory based rate adaption scheme for DASH over multiple servers
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706335
Chao Zhou, Xinggong Zhang, Zongming Guo
Recently, Dynamic Adaptive Streaming over HTTP (DASH) has been widely deployed on the Internet, but research on DASH over Multiple Content Distribution Servers (MCDS) remains scarce. Compared with traditional single-server DASH, MCDS offer expanded bandwidth, link diversity, and reliability. It is, however, challenging to smooth video bitrate switching across multiple servers because of their diverse bandwidths. In this paper, we propose a block-based rate adaptation method that considers both the diverse bandwidths and the buffered video time fed back by the client. Multiple fragments are grouped into a block, and the fragments are downloaded in parallel from multiple servers. We propose to adapt the video bitrate at the block level rather than at the fragment level. By dynamically adjusting the block length and scheduling fragment requests across servers, the video bitrates requested from the multiple servers are synchronized, so fragments are downloaded in order. We then propose a control-theoretic approach to select an appropriate bitrate for each block. By modeling and linearizing the rate adaptation system, we design a novel Proportional-Derivative (PD) controller that adapts the video bitrate with high responsiveness and stability. Theoretical analysis and extensive experiments on the Internet demonstrate the efficiency of our DASH designs.
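As a sketch of such a PD rule (the gains, the buffer set-point, and the multiplicative update are assumptions, not the paper's derived controller), the buffered-time error and its derivative can drive the bitrate chosen for the next block:

```python
import bisect

def pd_select_bitrate(rate_prev, buffer_s, prev_buffer_s, target_s, ladder,
                      kp=0.05, kd=0.3, dt=2.0):
    """Pick the next block's bitrate with a PD rule on buffered video time."""
    err = buffer_s - target_s                    # proportional: buffer error
    derr = (buffer_s - prev_buffer_s) / dt       # derivative: buffer trend
    raw = rate_prev * (1.0 + kp * err + kd * derr)
    rungs = sorted(ladder)
    i = bisect.bisect_right(rungs, raw) - 1      # snap down to a ladder rung
    return rungs[max(i, 0)]

# e.g. pd_select_bitrate(2500, 12.0, 10.0, 15.0, [350, 800, 1500, 2500, 4300])
# returns 2500: the buffer is below target but rising, so the rate holds.
```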
Citations: 6
Dense depth acquisition via one-shot stripe structured light
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706402
Qin Li, Fu Li, Guangming Shi, Fei Qi, Yuexin Shi, Shan Gao
Depth acquisition for moving objects is increasingly critical for applications such as human facial expression recognition. This paper presents a method for capturing depth maps of moving objects using a one-shot black-and-white stripe pattern that is simple and easy to generate. Matching accuracy is crucial for a precise depth map, but matching variable-width stripes yields only sparse and rough correspondences; we therefore use phase differences extracted by a Gabor filter to achieve pixel-wise matching with sub-pixel accuracy. The derivation is presented in detail to show that this method based on Gabor-filter phase differences is valid. In addition, the periodic ambiguity of the encoded stripes is eliminated by restricting the epipolar segment to a given depth range at the camera-projector calibration stage, which reduces computational complexity. Experimental results show that our method obtains dense and accurate depth maps of moving objects.
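The period-ambiguity step can be illustrated with a small sketch: if calibration bounds the disparity to a window narrower than one stripe period, the integer period index is determined uniquely. Function and variable names here are hypothetical, not the paper's formulation.

```python
import math

def unwrap_stripe_disparity(phase, d_min, d_max, period):
    """Resolve the stripe-period ambiguity inside a calibrated disparity window.

    phase: fractional match position within one stripe period (pixels);
    [d_min, d_max]: disparity bounds implied by the depth range along the
    epipolar segment. The result is unique when d_max - d_min < period.
    """
    k = math.ceil((d_min - phase) / period)   # smallest period index in range
    d = k * period + phase
    return d if d <= d_max else None          # no valid candidate in window
```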
Citations: 4
Universal and low-complexity quantizer design for compressive sensing image coding
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706403
Xiangwei Li, Xuguang Lan, Meng Yang, Jianru Xue, Nanning Zheng
Compressive sensing imaging (CSI) is a new framework for image coding that acquires and compresses a scene simultaneously. The CS encoder efficiently shifts the bulk of the system complexity to the decoder. Ideally, CSI provides lossless compression in image coding; in this paper, we consider lossy compression of the CS measurements in a CSI system. We design a universal quantizer for the CS measurements of any input image. The proposed method first establishes a universal probability model for the CS measurements in advance, without any knowledge of the input image; a fast quantizer is then designed from this model. Simulation results demonstrate that the proposed method achieves nearly optimal rate-distortion (R-D) performance while keeping the computational complexity of the CS encoder very low.
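One standard way to build a quantizer from a fixed source model is Lloyd-Max training performed once, offline, so no per-image statistics are needed at encode time. The sketch below assumes a Gaussian stand-in for the universal measurement model; the paper's actual model and quantizer design are not reproduced here.

```python
import numpy as np

def lloyd_max(samples, levels=16, iters=50):
    """Train a Lloyd-Max scalar quantizer once, offline, on model samples."""
    codebook = np.quantile(samples, (np.arange(levels) + 0.5) / levels)
    for _ in range(iters):
        edges = (codebook[:-1] + codebook[1:]) / 2      # nearest-neighbor cells
        idx = np.searchsorted(edges, samples)
        codebook = np.array([samples[idx == j].mean() if np.any(idx == j)
                             else codebook[j] for j in range(levels)])
    return codebook  # encode: np.abs(y - codebook).argmin(); decode: codebook[i]

# Gaussian stand-in for the universal measurement model (assumption):
rng = np.random.default_rng(0)
codebook = lloyd_max(rng.standard_normal(100_000))
```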
Citations: 6
Expression-invariant and sparse representation for mesh-based compression for 3-D face models
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706442
Junhui Hou, Lap-Pui Chau, Ying He, N. Magnenat-Thalmann
Compression of mesh-based 3-D models is an important issue for efficient storage and transmission. In this paper, we present an effective compression scheme specifically for 3-D face models with expression variation. First, the 3-D models are mapped into a 2-D parametric domain and put into correspondence by expression-invariant parameterization, yielding a 2-D image representation known as geometry images (GIs); this reduces 3-D model compression to 2-D image compression. Then, sparse representation over dictionaries learned via K-SVD is applied to each patch of the sliced GI, so that only a few coefficients and their indices need to be encoded, resulting in a small data size. Experimental results demonstrate that the proposed scheme significantly improves compression performance over existing algorithms, especially at low bitrates.
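To show what "only a few coefficients and their indices" means in practice, here is a minimal Orthogonal Matching Pursuit coder over a K-SVD-style dictionary, a plain-numpy sketch; the dictionary, patch vectorization, and sparsity level are assumed given, and the entropy coding stage is omitted.

```python
import numpy as np

def omp(D, x, n_nonzero=8):
    """Greedy OMP sparse code of patch x over unit-norm dictionary D (d, K).

    Only the selected atom indices in `support` and the coefficients in
    `coef` would need to be entropy-coded, hence the small data size.
    """
    residual, support = x.astype(float).copy(), []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        support.append(int(np.argmax(np.abs(D.T @ residual))))  # best atom
        sub = D[:, support]
        coef, *_ = np.linalg.lstsq(sub, x, rcond=None)  # refit on support
        residual = x - sub @ coef
    codes = np.zeros(D.shape[1])
    codes[support] = coef
    return codes
```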
Citations: 2
Joint trilateral filtering for depth map super-resolution
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706444
Kai-Han Lo, Y. Wang, K. Hua
Depth map super-resolution is an emerging topic due to the growing needs and applications of RGB-D sensors. Together with the color image, the corresponding range data provides additional information and makes visual analysis tasks more tractable. However, since the depth maps captured by such sensors typically have limited resolution, enhancing their resolution is desirable for improved recognition. In this paper, we present a novel joint trilateral filtering (JTF) algorithm for depth map super-resolution (SR). Inspired by bilateral filtering, our JTF utilizes and preserves edge information from the associated high-resolution (HR) image by exploiting the spatial and range information of local pixels. The proposed method further integrates local gradient information of the depth map when synthesizing the HR output, which alleviates textural artifacts such as edge discontinuities. Quantitative and qualitative experimental results demonstrate the effectiveness and robustness of our approach over prior depth map upsampling works.
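A minimal per-pixel sketch of a joint trilateral filter is given below, assuming the low-resolution depth has already been resampled to the guide size; the kernel forms and parameters are illustrative choices under those assumptions, not the paper's exact formulation.

```python
import numpy as np

def joint_trilateral_filter(depth_up, guide, radius=4,
                            sigma_s=2.0, sigma_r=0.1, sigma_g=0.05):
    """Joint trilateral filtering of a pre-upsampled depth map.

    depth_up: (H, W) depth resampled to the guide resolution; guide: (H, W)
    grayscale HR image in [0, 1]. Weights combine spatial distance,
    guide-range similarity, and depth-gradient similarity; the third,
    gradient term is what extends a joint bilateral filter.
    """
    H, W = depth_up.shape
    gy, gx = np.gradient(depth_up)
    grad = np.hypot(gx, gy)                       # local depth-gradient magnitude
    ys = np.arange(-radius, radius + 1)
    spatial = np.exp(-(ys[:, None]**2 + ys[None, :]**2) / (2 * sigma_s**2))
    pad = lambda a: np.pad(a, radius, mode='edge')
    dpad, gpad, grpad = pad(depth_up), pad(guide), pad(grad)
    out = np.zeros_like(depth_up)
    for i in range(H):
        for j in range(W):
            sl = (slice(i, i + 2 * radius + 1), slice(j, j + 2 * radius + 1))
            w = (spatial
                 * np.exp(-(gpad[sl] - guide[i, j])**2 / (2 * sigma_r**2))
                 * np.exp(-(grpad[sl] - grad[i, j])**2 / (2 * sigma_g**2)))
            out[i, j] = (w * dpad[sl]).sum() / w.sum()
    return out
```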
Citations: 34
Recovering depth of background and foreground from a monocular video with camera motion
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706409
Hu Tian, Bojin Zhuang, Yan Hua, Yanyun Zhao, A. Cai
In this paper we propose a depth recovery approach for monocular videos with or without camera motion. By combining geometric information with moving object extraction, the depth of both the background and the foreground can be recovered. Furthermore, for complex camera motion such as fast movement, translation, and vertical movement, we propose a novel global motion estimation (GME) method with effective outlier rejection to extract moving objects; experiments demonstrate that it outperforms most state-of-the-art methods. The proposed depth recovery approach is tested on four video sequences with different camera movements. Experimental results show that it produces more accurate depth for both background and foreground than existing depth recovery methods.
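A common way to realize global motion estimation with outlier rejection is iterative least-squares refitting of a parametric model. The sketch below fits an affine model to matched points (the paper's motion model and rejection rule may differ; the 2.5x-median threshold is an assumption) and returns an inlier mask whose complement hints at moving objects.

```python
import numpy as np

def estimate_global_affine(pts_prev, pts_curr, iters=5):
    """Affine global-motion estimation with iterative outlier rejection.

    pts_prev, pts_curr: (N, 2) matched points (e.g. block or feature
    matches). Points whose residual exceeds 2.5x the median are dropped
    and the model is refit; survivors are treated as background.
    """
    inlier = np.ones(len(pts_prev), bool)
    A = np.zeros((3, 2))
    for _ in range(iters):
        P = np.hstack([pts_prev[inlier], np.ones((inlier.sum(), 1))])
        A, *_ = np.linalg.lstsq(P, pts_curr[inlier], rcond=None)
        pred = np.hstack([pts_prev, np.ones((len(pts_prev), 1))]) @ A
        res = np.linalg.norm(pred - pts_curr, axis=1)
        inlier = res <= 2.5 * max(float(np.median(res)), 1e-6)
    return A.T, inlier   # 2x3 affine matrix, background inlier mask
```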
Citations: 3