
Latest publications: 2013 Visual Communications and Image Processing (VCIP)

Low complexity image matching using color based SIFT
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706456
Abhishek Nagar, A. Saxena, S. Bucak, Felix C. A. Fernandes, Kong-Posh Bhat
Image matching and search are gaining significant commercial importance owing to the applications they enable, such as augmented reality and image queries for internet search. Many researchers have effectively used the color information in an image to improve matching accuracy. These techniques, however, cannot be directly used for large-scale mobile visual search applications, which pose strict constraints on the size of the extracted features, the computational resources, and the system accuracy. To overcome this limitation, we propose a new and effective technique for incorporating color information that can use the SIFT extraction technique. We conduct our experiments on a large dataset containing around 33,000 images that is currently being investigated in the MPEG Compact Descriptors for Visual Search standard, and show substantial improvement over the baseline.
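The abstract does not spell out how color is folded into the descriptor, so as an illustration only, here is a minimal numpy sketch of one common way to add color to a SIFT-style pipeline: compute a gradient-orientation histogram per color channel and concatenate. The function names and the 8-bin / 16x16-patch parameters are hypothetical, not the authors' design.

```python
import numpy as np

def channel_orientation_histogram(channel, bins=8):
    """Gradient-orientation histogram for one color channel (a SIFT-like cell)."""
    gy, gx = np.gradient(channel.astype(float))
    mag = np.hypot(gx, gy)                              # gradient magnitudes
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)         # orientations in [0, 2*pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist

def color_descriptor(patch_rgb, bins=8):
    """Concatenate per-channel histograms into one color-aware descriptor."""
    return np.concatenate([channel_orientation_histogram(patch_rgb[..., c], bins)
                           for c in range(patch_rgb.shape[-1])])

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(16, 16, 3))          # toy 16x16 RGB patch
d = color_descriptor(patch)                             # 3 channels x 8 bins
```

Each per-channel sub-histogram is L2-normalized, so descriptors of differently exposed patches remain comparable.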
Citations: 2
From local representation to global face hallucination: A novel super-resolution method by nonnegative feature transformation
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706354
T. Lu, R. Hu, Zhen Han, Junjun Jiang, Yanduo Zhang
Most global face hallucination methods treat the face as a whole, ignoring the fact that a face is composed of part-based organs. As a result, the outputs of these methods often lack detailed information. Nonnegative matrix factorization (NMF) based face hallucination is well suited to enhancing such detail. Usually, however, the NMF basis is learnt only from high-resolution (HR) samples, leading to over-smooth output lacking high-frequency details. To solve this problem, we propose a simple but novel face hallucination method using nonnegative feature transformation in a two-step framework. In particular, we learn NMF bases from low-resolution (LR) and HR samples separately, and then transform the local representation feature of the input into the global representation subspace, keeping the weights in the HR sample space for the output. Furthermore, the maximum a posteriori (MAP) method is used to estimate a better output. Experiments show that the faces hallucinated by the proposed method not only contain more high-frequency details but also outperform many state-of-the-art algorithms.
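The coupled-basis idea behind such LR-to-HR transformation can be sketched with plain multiplicative-update NMF: factor stacked LR/HR training vectors so the LR and HR bases share coefficients, then encode an LR input with the LR basis and decode with the HR one. This is a toy illustration under that assumption, not the authors' exact method (which learns the bases separately and adds MAP estimation); all names and dimensions are hypothetical.

```python
import numpy as np

def nmf(V, r, iters=300, seed=0):
    """Plain multiplicative-update NMF: V ~ W @ H with nonnegative factors."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], r)) + 0.1
    H = rng.random((r, V.shape[1])) + 0.1
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

def hallucinate(x_lr, W_lr, W_hr, iters=300):
    """Encode an LR input against the LR basis, decode with the HR basis."""
    h = np.full((W_lr.shape[1], 1), 0.5)
    for _ in range(iters):                    # nonnegative coefficient fit
        h *= (W_lr.T @ x_lr) / (W_lr.T @ W_lr @ h + 1e-9)
    return W_hr @ h

# Toy data: HR training vectors and 2x-downsampled LR counterparts.
rng = np.random.default_rng(1)
hr = rng.random((16, 40))                     # 40 HR training vectors, dim 16
lr = hr.reshape(8, 2, 40).mean(axis=1)        # dim-8 LR versions
W, _ = nmf(np.vstack([lr, hr]), r=6)          # coupled factorization
W_lr, W_hr = W[:8], W[8:]
y = hallucinate(lr[:, :1], W_lr, W_hr)        # HR estimate from one LR vector
```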
Citations: 6
A 3D→4D color space transform for efficient lossless image compression
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706379
J. Porwal
Color images are commonly represented as a combination of different components, for example Red, Green and Blue in the RGB color space. The components are often correlated and contain common information. By representing an image in a color space in which its components are decorrelated, it can be encoded more efficiently, achieving better compression. Most approaches to finding a suitable color space for image compression are limited to transforming an n-component color space into another n-component color space. In this paper, we propose a novel transform that converts a 3-component RGB image to a 4-component cGST (color, gray, shade, tinge) image and vice versa, and show its suitability for image compression. The transform is fully reversible (and hence suitable for lossy as well as lossless image compression) and preserves the bit-length of the GST components (allowing existing algorithms to be applied to them). We develop an encoder-decoder tool using the transform and the JPEG-LS prediction scheme, and demonstrate its efficiency (up to 35% better compression ratios than JPEG-LS, and 2-5 times less runtime than JPEG 2000 at similar compression ratios) on a diverse set of test images. The transform works especially well for satellite images, computer-generated animations and real images with shadows. The work also opens the scope for studying color transforms not restricted to matrix multiplication or n→n dimensional conversions for image compression. Our work also adds to the understanding of the impact of shadows on color components and is useful for image analysis in general.
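The abstract does not define the cGST transform itself, but the key property it claims, exact integer reversibility, can be illustrated with a standard reversible color transform: YCoCg-R (used in H.264/AVC FRExt), built entirely from integer lifting steps.

```python
def rgb_to_ycocg_r(r, g, b):
    """Lossless RGB -> YCoCg-R via integer lifting steps (H.264 FRExt)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg

def ycocg_r_to_rgb(y, co, cg):
    """Exact inverse: undo the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b
```

Because every step is an integer add or shift, the inverse recovers each pixel exactly, which is what makes such transforms usable for lossless coding.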
Citations: 4
Correlation estimation for distributed wireless video communication
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706372
Xiaoliang Zhu, N. Zhang, Xiaopeng Fan, Ruiqin Xiong, Debin Zhao
One important problem in distributed video coding is to estimate the variance of the correlation noise between the video signal and its decoder side information. This variance is hard to estimate due to the lack of motion vectors at the encoder side. In this paper, we first propose a linear model that estimates this variance by referring to the zero-motion prediction at the encoder, based on a Markov field assumption. Furthermore, we consider not only the prediction noise from the video signal itself but also the additional noise due to wireless transmission. We applied our correlation estimation method in our recent distributed wireless visual communication framework, DCAST. The experimental results show that the proposed method improves video PSNR by 0.5-1.5 dB while avoiding motion estimation at the encoder.
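A rough sketch of the encoder-side quantities involved, assumed from standard distributed-video-coding practice rather than taken from the paper (whose linear-model coefficients the abstract does not give): the variance of the zero-motion residual is observable at the encoder, and correlation noise is commonly modeled as a zero-mean Laplacian.

```python
import numpy as np

def zero_motion_residual_variance(frame, prev_frame):
    """Variance of the residual against the zero-motion (co-located) prediction,
    the encoder-side observable such a linear model would start from."""
    resid = frame.astype(float) - prev_frame.astype(float)
    return resid.var()

def laplacian_alpha(variance):
    """Parameter of the Laplacian commonly used to model correlation noise:
    a zero-mean Laplacian with variance v has alpha = sqrt(2 / v)."""
    return np.sqrt(2.0 / variance)
```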
Citations: 0
Video viewer state estimation using gaze tracking and video content analysis
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706365
Jae-Woo Kim, Jong-Ok Kim
In this paper, we propose a novel viewer state model based on gaze tracking and video content analysis. This paper makes two primary contributions. First, we significantly improve gaze state classification by incorporating video content analysis. Second, based on the estimated gaze state, we propose a novel viewer state model indicating both the viewer's interest and the existence of the viewer's ROIs. Experiments were conducted to verify the performance of the proposed gaze state classifier and viewer state model. The experimental results show that using video content analysis in gaze state classification considerably improves the classification results, and consequently the viewer state model correctly estimates the interest state of video viewers.
Citations: 1
An Apriori-like algorithm for automatic extraction of the common action characteristics
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706394
Tran Thang Thanh, Fan Chen, K. Kotani, H. Le
With the development of technologies such as specialized 3D markers, we can capture the motion signals of marker joints and create huge sets of 3D action MoCap data. The better we understand human actions, the better we can apply them to applications such as security, sports analysis, and games. In order to find semantically representative features of human actions, we extract the sets of action characteristics that appear frequently in the database. We then propose an Apriori-like algorithm to automatically extract the common sets shared by different action classes. The extracted representative action characteristics are defined at the semantic level, so they better describe the intrinsic differences between various actions. Our experiments show that the knowledge extracted by this method recognizes actions with high accuracy, over 80% on both training and testing data.
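The level-wise search the title alludes to is the classic Apriori pattern: grow frequent itemsets one element at a time, pruning any candidate with an infrequent subset. A generic sketch on toy transactions follows; the `transactions` and `min_support` names are illustrative, and the paper applies the idea to action characteristics rather than market baskets.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Classic Apriori: grow frequent itemsets level by level, pruning by support."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(itemset <= t for t in transactions)

    items = sorted({i for t in transactions for i in t})
    frequent = {}
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support]
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s)
        # Candidate (k+1)-itemsets: unions of frequent k-itemsets whose
        # every k-subset is itself frequent (the Apriori pruning rule).
        cand = {a | b for a in level for b in level if len(a | b) == k + 1}
        level = [c for c in cand
                 if all(frozenset(sub) in frequent for sub in combinations(c, k))
                 and support(c) >= min_support]
        k += 1
    return frequent
```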
Citations: 2
Object co-segmentation based on directed graph clustering
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706376
Fanman Meng, Bing Luo, Chao Huang
In this paper, we develop a new algorithm to segment multiple common objects from a group of images. Our method consists of two steps: directed graph clustering and prior propagation. Clustering groups the local regions of the original images, and foreground priors are generated from these clusters. The second step propagates the prior of each class and locates the common objects in the images via a foreground map. Finally, we use the foreground map as the unary term of a Markov random field segmentation and segment the common objects with the graph-cuts algorithm. We test our method on the FlickrMFC and ICoseg datasets. The experimental results show that the proposed method achieves higher accuracy than several state-of-the-art co-segmentation methods.
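The abstract does not describe its directed-graph clustering in detail; as a stand-in illustration only, here is one very simple directed-graph clustering over region features: give each region a single outgoing edge to its nearest neighbor and take weakly connected components as clusters. Everything below is hypothetical, not the authors' construction.

```python
import numpy as np

def nearest_neighbor_clusters(features):
    """Directed-graph clustering sketch: each node points to its nearest
    neighbor; weakly connected components of that graph become clusters."""
    n = len(features)
    d = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # no self-edges
    succ = d.argmin(axis=1)                   # one outgoing edge per node

    # Union-find over the undirected version of the edges.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    for i, j in enumerate(succ):
        parent[find(i)] = find(int(j))
    return [find(i) for i in range(n)]        # cluster label per node
```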
Citations: 7
Bilinear decomposition for blended expressions representation
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706355
Catherine Soladié, R. Séguier, Nicolas Stoiber
This paper proposes a new method for the analysis of blended expressions of varying intensity. The method is based on an asymmetric bilinear model learned from a small set of expressions. In the resulting expression space, a blended unknown expression has a signature that can be interpreted as a mixture of the basic expressions used to create the space. Three methods are compared: a traditional method based on active appearance vectors, the asymmetric bilinear model on person-independent appearance vectors, and the asymmetric bilinear model on person-specific appearance vectors. Experimental results on the recognition of 14 blended unknown expressions show the relevance of the bilinear models compared with appearance-based methods, and the robustness of the person-specific models with respect to the types of parameters (shape and/or texture).
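The signature-as-mixture idea can be sketched with a plain linear model, which is a simplification: the paper's model is asymmetric bilinear with person factors, while this sketch drops the person dimension entirely. Learn a basis from a few basic expressions, take signatures as coordinates in that space, and read a blend's signature as a mixture of the basic signatures; all data below is synthetic.

```python
import numpy as np

# Training matrix: columns are appearance vectors of a few basic expressions.
rng = np.random.default_rng(2)
basics = rng.random((30, 4))                  # 4 basic expressions, dim 30
U, _, _ = np.linalg.svd(basics, full_matrices=False)

def signature(x):
    """Coordinates of an appearance vector in the learned expression space."""
    return U.T @ x

# A blend of 70% basic expression 0 and 30% basic expression 2 ...
blend = 0.7 * basics[:, 0] + 0.3 * basics[:, 2]
# ... whose signature decomposes into the basic-expression signatures.
weights, *_ = np.linalg.lstsq(signature(basics), signature(blend), rcond=None)
```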
Citations: 0
Estimation of the primary quantization parameter in MPEG videos
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706413
Wan Wang, Xinghao Jiang, Shilin Wang, Tanfeng Sun
Advanced technology and sophisticated software have left audiovisual content exposed to forgery, inspiring the emergence of multimedia forensics research. Since video tampering may involve double compression, the analysis of compression history is significant. In this paper, we consider the processing chain of two compression steps and propose an algorithm that identifies the quantization parameter used in the earlier coding process. The method relies on the fact that characteristic footprints can be observed under different relationships between the quantization parameters of consecutive compression operations. Features are extracted from both the Discrete Cosine Transform (DCT) coefficients and their differential counterparts to capture the statistical disturbance. Experimental results demonstrate the effectiveness of our method.
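The double-quantization footprint such methods build on can be reproduced in a few lines: requantizing coefficients already quantized with step q1 using a different step q2 leaves characteristic gaps in the level histogram. The q1=5 / q2=3 choice and the Laplacian source model below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def double_quantize(coeffs, q1, q2):
    """Quantize with step q1, dequantize, then requantize with step q2:
    the chain a twice-compressed video's DCT coefficients go through."""
    once = np.round(coeffs / q1) * q1
    return np.round(once / q2).astype(int)

def histogram_feature(levels, max_abs=20):
    """Normalized histogram of quantization levels, the raw material for
    footprint-based estimation of the primary quantization parameter."""
    hist, _ = np.histogram(levels, bins=np.arange(-max_abs - 0.5, max_abs + 1.5))
    return hist / hist.sum()

rng = np.random.default_rng(1)
coeffs = rng.laplace(scale=8.0, size=10000)   # toy AC-coefficient model
h = histogram_feature(double_quantize(coeffs, q1=5, q2=3))
```

With q1=5 and q2=3, the first-pass output is a multiple of 5, so second-pass levels such as +1 and -1 are unreachable: those empty histogram bins are the footprint.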
Citations: 1
Visually lossless screen content coding using HEVC base-layer
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706364
Geert Braeckman, Shahid M. Satti, Heng Chen, A. Munteanu, P. Schelkens
This paper presents a novel two-layer coding framework targeting visually lossless compression of screen content video. The proposed framework employs the conventional HEVC standard for the base layer. For the enhancement layer, a hybrid spatial-temporal block-prediction mechanism is introduced to keep the energy of the error residual small. Spatial prediction is generally chosen for dynamic areas, while temporal prediction works better for static areas of a video frame. The prediction residual is quantized according to whether a given block is static or dynamic. Run-length coding, Golomb-based binarization and context-based arithmetic coding are employed to efficiently code the quantized residual and form the enhancement layer. Performance evaluations using 4:4:4 screen content sequences show that, for visually lossless video quality, the proposed system saves significant bit-rate compared to a two-layer lossless HEVC framework.
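Of the enhancement-layer tools listed, run-length coding is the easiest to sketch: it collapses the long zero runs that quantized screen-content residuals typically contain. A minimal generic implementation, not the paper's exact entropy-coding chain:

```python
def run_length_encode(symbols):
    """Run-length code a sequence as (value, run) pairs; effective on the
    long zero runs left after quantizing screen-content residuals."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1                  # extend the current run
        else:
            runs.append([s, 1])               # start a new run
    return [tuple(r) for r in runs]

def run_length_decode(runs):
    """Exact inverse: expand each (value, run) pair."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out
```

In a real coder the (value, run) pairs would then be binarized (e.g. with Golomb codes) and arithmetic-coded, as the abstract describes.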
Citations: 3