
Latest publications: 2013 Visual Communications and Image Processing (VCIP)

A novel motion compensated prediction framework using weighted AMVP prediction for HEVC
Pub Date: 2013-11-01 DOI: 10.1109/VCIP.2013.6706416
Li Yu, Guangtao Fu, Aidong Men, Binji Luo, Huiling Zhao
In this paper, we propose a novel motion compensated prediction (MCP) framework that combines motion vector restriction with weighted advanced motion vector prediction (AMVP) to achieve higher prediction accuracy in High Efficiency Video Coding (HEVC). In our framework, the motion vectors of the prediction units (PUs) surrounding the current PU are first checked against a motion field model, and the geometric relationship between the motion vectors of the current PU and its neighboring coded PUs is analyzed. The motion vector restriction criterion then determines whether the weighted AMVP prediction method is used, so that true motion vectors can be obtained. Experimental results show that the proposed framework achieves BD-PSNR gains ranging from 0.03 dB to 0.22 dB and BD-rate savings of up to 6.4%.
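As a rough illustration of the weighting idea, the sketch below forms a motion vector predictor from neighboring PU motion vectors, gated by a simple coherence test standing in for the paper's motion vector restriction criterion. The inverse-distance weights, the threshold, and the function name are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def weighted_amvp_predictor(neighbor_mvs, neighbor_dists, coherence_thresh=4.0):
    """Form a motion vector predictor as a weighted combination of
    neighboring PU motion vectors (hypothetical helper).

    neighbor_mvs   : (N, 2) motion vectors of coded neighboring PUs.
    neighbor_dists : (N,) spatial distances from the current PU.
    Returns the weighted predictor, or None when the local motion field
    is too incoherent (a stand-in for the restriction criterion).
    """
    mvs = np.asarray(neighbor_mvs, dtype=float)
    d = np.asarray(neighbor_dists, dtype=float)

    # Restriction test: large disagreement among neighbors suggests the
    # weighted prediction is unreliable; fall back to conventional AMVP.
    spread = np.max(np.linalg.norm(mvs - mvs.mean(axis=0), axis=1))
    if spread > coherence_thresh:
        return None

    w = 1.0 / (d + 1e-6)          # inverse-distance weights (assumed scheme)
    w /= w.sum()
    return (w[:, None] * mvs).sum(axis=0)

# Three coherent neighbors -> a weighted predictor close to all of them.
print(weighted_amvp_predictor([(2.0, 1.0), (2.5, 1.25), (1.75, 0.75)],
                              [1.0, 2.0, 1.5]))
```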
Citations: 0
Modeling the color image and video quality on liquid crystal displays with backlight dimming
Pub Date: 2013-11-01 DOI: 10.1109/VCIP.2013.6706383
J. Korhonen, Claire Mantel, Nino Burini, Søren Forchhammer
Objective image and video quality metrics focus mostly on the digital representation of the signal. However, the display characteristics are also essential to the overall Quality of Experience (QoE). In this paper, we use a model of a backlight dimming system for liquid crystal displays (LCDs) and show how the modeled image can be used as input to quality assessment algorithms. For quality assessment, we propose an image quality metric based on Peak Signal-to-Noise Ratio (PSNR) computed in the CIE L*a*b* color space. The metric takes into consideration the luminance reduction, color distortion, and loss of uniformity in the resulting image. Subjective evaluations of images generated using different backlight dimming algorithms and clipping strategies show that the proposed metric estimates perceived image quality more accurately than conventional PSNR.
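A minimal sketch of a PSNR computed in CIE L*a*b* space, assuming scikit-image's rgb2lab for the color conversion and the L* range of 100 as the peak value; the paper's metric additionally models the backlight-dimmed display, which this sketch omits.

```python
import numpy as np
from skimage.color import rgb2lab  # scikit-image: sRGB -> CIE L*a*b*

def lab_psnr(ref_rgb, test_rgb, peak=100.0):
    """PSNR computed in CIE L*a*b* space.

    ref_rgb, test_rgb: float arrays in [0, 1] with shape (H, W, 3).
    peak=100.0 (the L* range) is an assumed normalization; the paper's
    metric also accounts for display effects not modeled here.
    """
    mse = np.mean((rgb2lab(ref_rgb) - rgb2lab(test_rgb)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Compare a reference against a globally dimmed version of itself,
# crudely simulating a luminance reduction from backlight dimming.
ref = np.random.rand(64, 64, 3)
dimmed = np.clip(ref * 0.8, 0.0, 1.0)
print(f"L*a*b* PSNR: {lab_psnr(ref, dimmed):.2f} dB")
```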
Citations: 6
Gaze pattern analysis for video contents with different frame rates
Pub Date: 2013-11-01 DOI: 10.1109/VCIP.2013.6706429
Manri Cheon, Jong-Seok Lee
This paper presents a study investigating the viewing behavior of human subjects for video contents with different frame rates. Frame rate variability arises when temporal video scalability is used for adaptive video transmission, and the resulting variation in gaze patterns eventually affects visual perception, which must be considered during perceptual optimization of such a system. We design an eye-tracking experiment using several high-definition contents covering a wide range of content characteristics. Comparing the gaze points under a normal frame rate condition and a low frame rate condition shows that, although the overall viewing pattern remains quite similar, statistically significant differences are observed in some time intervals. The differences are analyzed in terms of two factors: overall gaze paths and subject-wise variability.
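One plausible way to test for interval-wise differences in gaze behavior between two frame-rate conditions is sketched below: gaze dispersion around each interval's centroid is compared with Welch's t-test. The statistic, interval length, and data layout are assumptions for illustration, not the paper's protocol.

```python
import numpy as np
from scipy.stats import ttest_ind

def compare_gaze_dispersion(gaze_a, gaze_b, interval=1.0):
    """Per-interval comparison of gaze dispersion between two conditions.

    gaze_a, gaze_b: (N, 3) arrays of (time_s, x, y) gaze samples recorded
    under the normal and low frame rate conditions (hypothetical layout).
    Returns (interval_start, p_value) pairs from Welch's t-test on each
    sample's distance to its condition's interval centroid.
    """
    a, b = np.asarray(gaze_a, float), np.asarray(gaze_b, float)
    results = []
    for t0 in np.arange(0.0, max(a[:, 0].max(), b[:, 0].max()), interval):
        sa = a[(a[:, 0] >= t0) & (a[:, 0] < t0 + interval), 1:]
        sb = b[(b[:, 0] >= t0) & (b[:, 0] < t0 + interval), 1:]
        if len(sa) < 2 or len(sb) < 2:
            continue  # too few samples in this interval to test
        da = np.linalg.norm(sa - sa.mean(axis=0), axis=1)
        db = np.linalg.norm(sb - sb.mean(axis=0), axis=1)
        results.append((t0, ttest_ind(da, db, equal_var=False).pvalue))
    return results
```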
Citations: 2
Endoscopy video summarization based on unsupervised learning and feature discrimination
Pub Date: 2013-11-01 DOI: 10.1109/VCIP.2013.6706410
M. Ismail, Ouiem Bchir, Ahmed Z. Emam
We propose a novel endoscopy video summarization approach based on unsupervised learning and feature discrimination. The proposed learning approach partitions the collection of video frames into homogeneous categories based on their visual and temporal descriptors. It also generates possibilistic memberships in order to represent the degree of typicality of each video frame within every category and to reduce the influence of noise frames on the learning process. The algorithm iteratively learns the optimal relevance weight for each feature subset within each cluster. Moreover, it finds the optimal number of clusters in an unsupervised and efficient way by exploiting properties of the possibilistic membership function. The endoscopy video summary consists of the most typical frames in all clusters after discarding noise frames. We compare the performance of the proposed algorithm with state-of-the-art learning approaches and show that the possibilistic approach is more robust. The endoscopy video collection includes more than 90k video frames.
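The sketch below shows the possibilistic membership (typicality) update at the core of such an approach, in the style of possibilistic c-means; the paper's algorithm additionally learns per-cluster feature relevance weights and the number of clusters, both of which are omitted here.

```python
import numpy as np

def possibilistic_cmeans(X, k, m=2.0, n_iter=30, seed=0):
    """Minimal possibilistic c-means (Krishnapuram-Keller style) sketch.

    X: (n, d) frame descriptors; k: number of clusters (assumed given,
    unlike in the paper). Returns centers and (n, k) typicality degrees.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        eta = d2.mean(axis=0) + 1e-12          # per-cluster scale parameter
        u = 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))   # typicality
        um = u ** m
        centers = (um.T @ X) / um.sum(axis=0)[:, None]
    return centers, u

# Summary frames: the most typical frame per cluster; frames with low
# typicality everywhere behave like noise frames and can be discarded.
X = np.random.rand(200, 16)
centers, u = possibilistic_cmeans(X, k=5)
summary_idx = u.argmax(axis=0)
```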
Citations: 22
A control theory based rate adaption scheme for DASH over multiple servers
Pub Date: 2013-11-01 DOI: 10.1109/VCIP.2013.6706335
Chao Zhou, Xinggong Zhang, Zongming Guo
Recently, Dynamic Adaptive Streaming over HTTP (DASH) has been widely deployed on the Internet, yet little research has addressed DASH over Multiple Content Distribution Servers (MCDS). Compared with traditional single-server DASH, MCDS can offer expanded bandwidth, link diversity, and reliability. It is, however, challenging to smooth video bitrate switching across multiple servers with diverse bandwidths. In this paper, we propose a block-based rate adaptation method that considers both the diverse bandwidths and the buffered video time reported as feedback. Multiple fragments are grouped into a block, and the fragments are downloaded in parallel from multiple servers. We propose to adapt the video bitrate at the block level rather than at the fragment level. By dynamically adjusting the block length and scheduling fragment requests to multiple servers, the video bitrates requested from the servers are synchronized, so that fragments are downloaded in order. We then propose a control-theoretic approach to select an appropriate bitrate for each block. By modeling and linearizing the rate adaptation system, we design a novel Proportional-Derivative (PD) controller that adapts the video bitrate with high responsiveness and stability. Theoretical analysis and extensive experiments on the Internet demonstrate the efficiency of our DASH designs.
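A minimal sketch of the PD control idea applied to DASH: the controller drives the buffered video time toward a target and scales the requested bitrate accordingly. The gains, the target buffer, the encoding ladder, and the buffer-to-rate mapping are illustrative assumptions rather than the paper's tuned design.

```python
BITRATES_KBPS = [400, 800, 1600, 3200, 6400]   # assumed encoding ladder

def pd_rate_control(buffer_s, prev_buffer_s, throughput_kbps,
                    target_s=10.0, kp=0.1, kd=0.05, dt=2.0):
    """PD controller sketch: drive buffered video time toward target_s.

    The control signal scales the measured throughput into a requested
    rate; kp, kd, target_s and this mapping are illustrative assumptions.
    """
    err = buffer_s - target_s                 # proportional term
    derr = (buffer_s - prev_buffer_s) / dt    # derivative term
    u = kp * err + kd * derr
    return max(0.0, throughput_kbps * (1.0 + u))

def pick_bitrate(rate_kbps):
    """Highest ladder rate not exceeding the controller's output."""
    feasible = [b for b in BITRATES_KBPS if b <= rate_kbps]
    return feasible[-1] if feasible else BITRATES_KBPS[0]

# Buffer below target and shrinking -> the controller requests less.
print(pick_bitrate(pd_rate_control(8.0, 9.0, 2500.0)))
```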
Citations: 6
Dense depth acquisition via one-shot stripe structured light
Pub Date: 2013-11-01 DOI: 10.1109/VCIP.2013.6706402
Qin Li, Fu Li, Guangming Shi, Fei Qi, Yuexin Shi, Shan Gao
Depth acquisition for moving objects is increasingly critical for applications such as human facial expression recognition. This paper presents a method for capturing the depth maps of moving objects that uses a one-shot black-and-white stripe pattern, which is simple and easy to generate. Since matching accuracy is crucial for a precise depth map, yet matching variable-width stripes yields only sparse and rough correspondences, phase differences extracted by a Gabor filter are used to achieve pixel-wise matching with sub-pixel accuracy. A detailed derivation is presented to show that this method, based on the phase difference computed by the Gabor filter, is valid. In addition, the periodic ambiguity of the encoded stripes is eliminated by restricting matching to the epipolar segment covering a given depth range, determined at a camera-projector calibration stage, which reduces the computational complexity. Experimental results show that our method obtains a dense and accurate depth map of a moving object.
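The phase-based matching idea can be sketched as follows: a complex 1-D Gabor filter applied along a scanline yields a phase that ramps across each stripe period, so matching phases between the captured image and the projected pattern localizes correspondences with sub-pixel precision. The wavelength and sigma values below are assumed, and the sketch omits the paper's epipolar disambiguation.

```python
import numpy as np

def gabor_phase(scanline, wavelength=16.0, sigma=8.0):
    """Phase of a complex 1-D Gabor response along an image scanline.

    The phase varies smoothly across each stripe period, so equal-phase
    points in the captured image and the projected pattern give a dense,
    sub-pixel correspondence (wavelength/sigma are assumed values).
    """
    x = np.arange(-3 * sigma, 3 * sigma + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2)) * np.exp(2j * np.pi * x / wavelength)
    response = np.convolve(scanline.astype(float), kernel, mode="same")
    return np.angle(response)

# A synthetic black-and-white stripe scanline with a 16-pixel period;
# its phase would be matched against the projected pattern's phase.
scanline = (np.sin(2 * np.pi * np.arange(256) / 16.0) > 0).astype(float)
phase = gabor_phase(scanline)
```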
Citations: 4
Universal and low-complexity quantizer design for compressive sensing image coding
Pub Date: 2013-11-01 DOI: 10.1109/VCIP.2013.6706403
Xiangwei Li, Xuguang Lan, Meng Yang, Jianru Xue, Nanning Zheng
Compressive sensing imaging (CSI) is a new framework for image coding that acquires and compresses a scene simultaneously. The CS encoder efficiently shifts the bulk of the system complexity to the decoder. Ideally, CSI provides lossless compression in image coding; in this paper, we consider the lossy compression of the CS measurements in a CSI system. We design a universal quantizer for the CS measurements of any input image. The proposed method first establishes, in advance, a universal probability model for the CS measurements without any knowledge of the input image. A fast quantizer is then designed based on this model. Simulation results demonstrate that the proposed method achieves nearly optimal rate-distortion (R-D) performance while maintaining very low computational complexity at the CS encoder.
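A sketch of the "universal" design idea: since random CS measurements of natural images are approximately Gaussian by the central limit theorem, one fixed Gaussian model can drive the quantizer design offline for any input. The quantile-based codebook below is an assumed construction, not necessarily the paper's.

```python
import numpy as np
from scipy.stats import norm

def build_universal_quantizer(levels=16):
    """Design decision boundaries and codewords offline from a fixed
    standard-Gaussian model (quantile-based construction, assumed here).
    """
    p = np.linspace(0.0, 1.0, levels + 1)
    edges = norm.ppf(p[1:-1])                  # interior bin boundaries
    centers = norm.ppf((p[:-1] + p[1:]) / 2)   # one codeword per bin
    return edges, centers

def quantize(measurements, edges, centers):
    """Fast quantization: normalize to the model scale, then bin-search."""
    z = (measurements - measurements.mean()) / measurements.std()
    idx = np.digitize(z, edges)
    return idx, centers[idx]

edges, centers = build_universal_quantizer()
y = np.random.randn(1000)   # stand-in: CS measurements are near-Gaussian (CLT)
idx, y_hat = quantize(y, edges, centers)
```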
Citations: 6
Expression-invariant and sparse representation for mesh-based compression for 3-D face models
Pub Date: 2013-11-01 DOI: 10.1109/VCIP.2013.6706442
Junhui Hou, Lap-Pui Chau, Ying He, N. Magnenat-Thalmann
Compression of mesh-based 3-D models is an important problem, as it enables efficient storage and transmission. In this paper, we present a very effective compression scheme specifically for 3-D face models with expression variation. First, the 3-D models are mapped into a 2-D parametric domain and put into correspondence by an expression-invariant parameterization, yielding a 2-D image representation known as geometry images, which reduces 3-D model compression to 2-D image compression. Then, sparse representation with dictionaries learned via K-SVD is applied to each patch of the sliced geometry image, so that only a few coefficients and their indices need to be encoded, resulting in a small data size. Experimental results demonstrate that the proposed scheme significantly improves compression performance, especially at low bitrates, compared with existing algorithms.
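The patch-wise sparse coding step might look like the following sketch, which substitutes scikit-learn's dictionary learner with OMP coding in place of the paper's K-SVD (the objective is analogous); the geometry-image slice, patch size, and sparsity level are placeholders.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d

gi = np.random.rand(64, 64)     # stand-in for one slice of a geometry image

# Vectorize 8x8 patches and learn an overcomplete dictionary; OMP coding
# keeps at most 5 nonzero coefficients per patch.
patches = extract_patches_2d(gi, (8, 8)).reshape(-1, 64)
dico = MiniBatchDictionaryLearning(n_components=128,
                                   transform_algorithm="omp",
                                   transform_n_nonzero_coefs=5,
                                   random_state=0)
codes = dico.fit(patches).transform(patches)

# Only the few nonzero coefficients and their indices need to be encoded.
print("avg nonzeros per patch:", (codes != 0).sum(axis=1).mean())
```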
Citations: 2
Joint trilateral filtering for depth map super-resolution
Pub Date: 2013-11-01 DOI: 10.1109/VCIP.2013.6706444
Kai-Han Lo, Y. Wang, K. Hua
Depth map super-resolution is an emerging topic due to the increasing number of applications using RGB-D sensors. Together with the color image, the corresponding range data provide additional information and make visual analysis tasks more tractable. However, since the depth maps captured by such sensors typically have limited resolution, it is preferable to enhance their resolution for improved recognition. In this paper, we present a novel joint trilateral filtering (JTF) algorithm for solving depth map super-resolution (SR) problems. Inspired by bilateral filtering, our JTF utilizes and preserves edge information from the associated high-resolution (HR) image by taking the spatial and range information of local pixels. Our proposed method further integrates local gradient information of the depth map when synthesizing its HR output, which alleviates textural artifacts such as edge discontinuities. Quantitative and qualitative experimental results demonstrate the effectiveness and robustness of our approach over prior depth map upsampling works.
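A direct (unoptimized) sketch of the trilateral weighting: each output depth value averages its neighborhood with weights combining spatial distance, intensity similarity in the HR guide image, and local depth-gradient similarity. The parameter values, the gradient term's exact form, and the function name are assumptions, not the paper's formulation.

```python
import numpy as np

def joint_trilateral_upsample(depth_lr, guide_hr, radius=3,
                              sigma_s=2.0, sigma_r=0.1, sigma_g=0.1):
    """Upsample a low-res depth map guided by a high-res intensity image.

    Weights combine (1) spatial distance, (2) intensity similarity in the
    HR guide, and (3) local depth-gradient similarity; guide_hr is float
    grayscale in [0, 1] whose size is an integer multiple of depth_lr's.
    """
    H, W = guide_hr.shape
    d = np.kron(depth_lr, np.ones((H // depth_lr.shape[0],
                                   W // depth_lr.shape[1])))  # NN init
    gy, gx = np.gradient(d)
    grad = np.hypot(gx, gy)
    out = np.empty_like(d)
    for y in range(H):
        for x in range(W):
            y0, y1 = max(0, y - radius), min(H, y + radius + 1)
            x0, x1 = max(0, x - radius), min(W, x + radius + 1)
            yy, xx = np.mgrid[y0:y1, x0:x1]
            ws = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
            wr = np.exp(-(guide_hr[y0:y1, x0:x1] - guide_hr[y, x]) ** 2
                        / (2 * sigma_r ** 2))
            wg = np.exp(-(grad[y0:y1, x0:x1] - grad[y, x]) ** 2
                        / (2 * sigma_g ** 2))
            w = ws * wr * wg
            out[y, x] = (w * d[y0:y1, x0:x1]).sum() / w.sum()
    return out
```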
Citations: 34
Recovering depth of background and foreground from a monocular video with camera motion
Pub Date: 2013-11-01 DOI: 10.1109/VCIP.2013.6706409
Hu Tian, Bojin Zhuang, Yan Hua, Yanyun Zhao, A. Cai
In this paper we propose a depth recovery approach for monocular videos with or without camera motion. By combining geometric information with moving object extraction, the depth of both the background and the foreground can be recovered. Furthermore, for cases involving complex camera motion such as fast motion, translation, and vertical movement, we propose a novel global motion estimation (GME) method with effective outlier rejection to extract moving objects; experiments demonstrate that the proposed GME method outperforms most state-of-the-art methods. The proposed depth recovery approach is tested on four video sequences with different camera movements. Experimental results show that our approach produces more accurate depth for both background and foreground than existing depth recovery methods.
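For orientation, global motion estimation with outlier rejection can be sketched with standard OpenCV primitives: track features, fit a global affine model with RANSAC, and treat the RANSAC outliers as candidate moving-object points. This is a generic stand-in, not the paper's proposed GME method.

```python
import cv2
import numpy as np

def estimate_global_motion(prev_gray, curr_gray):
    """Global (camera) motion between two grayscale frames.

    Tracks corners with pyramidal Lucas-Kanade, fits a global affine
    model with RANSAC, and returns the model plus the RANSAC outliers,
    which are candidate moving-object points.
    """
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                  qualityLevel=0.01, minDistance=8)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    ok = status.ravel() == 1
    A, inliers = cv2.estimateAffine2D(pts[ok], nxt[ok], method=cv2.RANSAC,
                                      ransacReprojThreshold=1.0)
    outliers = pts[ok][inliers.ravel() == 0]   # disagree with camera motion
    return A, outliers
```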
Citations: 3