
Latest publications: 2013 Visual Communications and Image Processing (VCIP)

Low complexity image matching using color based SIFT
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706456
Abhishek Nagar, A. Saxena, S. Bucak, Felix C. A. Fernandes, Kong-Posh Bhat
Image matching and search are gaining significant commercial importance owing to the applications they enable, such as augmented reality and image queries for internet search. Many researchers have effectively used the color information in an image to improve matching accuracy. These techniques, however, cannot be directly used for large-scale mobile visual search applications, which pose strict constraints on the size of the extracted features, the computational resources, and the system accuracy. To overcome this limitation, we propose a new and effective technique for incorporating color information that can use the SIFT extraction technique. We conduct our experiments on a large dataset containing around 33,000 images that is currently being investigated in the MPEG Compact Descriptors for Visual Search standard, and show substantial improvement over the baseline.
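The abstract does not spell out how color is folded into the descriptor, so as an illustration only, here is a minimal numpy sketch of one common way to add color to a SIFT-style pipeline: compute a gradient-orientation histogram per color channel and concatenate. The function names and the 8-bin / 16x16-patch parameters are hypothetical, not the authors' design.

```python
import numpy as np

def channel_orientation_histogram(channel, bins=8):
    """Gradient-orientation histogram for one color channel (a SIFT-like cell)."""
    gy, gx = np.gradient(channel.astype(float))
    mag = np.hypot(gx, gy)                              # gradient magnitudes
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)         # orientations in [0, 2*pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist

def color_descriptor(patch_rgb, bins=8):
    """Concatenate per-channel histograms into one color-aware descriptor."""
    return np.concatenate([channel_orientation_histogram(patch_rgb[..., c], bins)
                           for c in range(patch_rgb.shape[-1])])

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(16, 16, 3))          # toy 16x16 RGB patch
d = color_descriptor(patch)                             # 3 channels x 8 bins
```

Each per-channel sub-histogram is L2-normalized, so descriptors of differently exposed patches remain comparable.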
Citations: 2
From local representation to global face hallucination: A novel super-resolution method by nonnegative feature transformation
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706354
T. Lu, R. Hu, Zhen Han, Junjun Jiang, Yanduo Zhang
Most global face hallucination methods treat the face as a whole, ignoring the fact that a face is composed of part-based organs. As a result, the outputs of these methods often lack detailed information. Nonnegative matrix factorization (NMF) based face hallucination is well suited to enhancing such detail. Usually, however, the NMF basis is learnt only from high-resolution (HR) samples, leading to over-smooth output lacking high-frequency details. To solve this problem, we propose a simple but novel face hallucination method using nonnegative feature transformation in a two-step framework. In particular, we learn NMF bases from low-resolution (LR) and HR samples separately, and then transform the local representation feature of the input into the global representation subspace, keeping the weights in the HR sample space for the output. Furthermore, the maximum a posteriori (MAP) method is used to estimate a better output. Experiments show that the faces hallucinated by the proposed method not only contain more high-frequency details but also outperform many state-of-the-art algorithms.
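The coupled-basis idea behind such LR-to-HR transformation can be sketched with plain multiplicative-update NMF: factor stacked LR/HR training vectors so the LR and HR bases share coefficients, then encode an LR input with the LR basis and decode with the HR one. This is a toy illustration under that assumption, not the authors' exact method (which learns the bases separately and adds MAP estimation); all names and dimensions are hypothetical.

```python
import numpy as np

def nmf(V, r, iters=300, seed=0):
    """Plain multiplicative-update NMF: V ~ W @ H with nonnegative factors."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], r)) + 0.1
    H = rng.random((r, V.shape[1])) + 0.1
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

def hallucinate(x_lr, W_lr, W_hr, iters=300):
    """Encode an LR input against the LR basis, decode with the HR basis."""
    h = np.full((W_lr.shape[1], 1), 0.5)
    for _ in range(iters):                    # nonnegative coefficient fit
        h *= (W_lr.T @ x_lr) / (W_lr.T @ W_lr @ h + 1e-9)
    return W_hr @ h

# Toy data: HR training vectors and 2x-downsampled LR counterparts.
rng = np.random.default_rng(1)
hr = rng.random((16, 40))                     # 40 HR training vectors, dim 16
lr = hr.reshape(8, 2, 40).mean(axis=1)        # dim-8 LR versions
W, _ = nmf(np.vstack([lr, hr]), r=6)          # coupled factorization
W_lr, W_hr = W[:8], W[8:]
y = hallucinate(lr[:, :1], W_lr, W_hr)        # HR estimate from one LR vector
```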
Citations: 6
A 3D→4D color space transform for efficient lossless image compression
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706379
J. Porwal
Color images are commonly represented as a combination of different components, for example Red, Green and Blue in the RGB color space. The components are often correlated and contain common information. By representing an image in a color space in which its components are decorrelated, it can be encoded more efficiently, achieving better compression. Most approaches to finding a suitable color space for image compression are limited to transforming an n-component color space into another n-component color space. In this paper, we propose a novel transform that converts a 3-component RGB image to a 4-component cGST (color, gray, shade, tinge) image and vice versa, and show its suitability for image compression. The transform is fully reversible (and hence suitable for lossy as well as lossless image compression) and preserves the bit-length of the GST components (allowing existing algorithms to be applied to them). We develop an encoder-decoder tool using the transform and the JPEG-LS prediction scheme, and demonstrate its efficiency (up to 35% better compression ratios than JPEG-LS, and 2-5 times less runtime than JPEG 2000 at similar compression ratios) on a diverse set of test images. The transform works especially well for satellite images, computer-generated animations and real images with shadows. The work also opens the scope for studying color transforms not restricted to matrix multiplication or n→n dimensional conversions for image compression. Our work also adds to the understanding of the impact of shadows on color components and is useful for image analysis in general.
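The abstract does not define the cGST transform itself, but the key property it claims, exact integer reversibility, can be illustrated with a standard reversible color transform: YCoCg-R (used in H.264/AVC FRExt), built entirely from integer lifting steps.

```python
def rgb_to_ycocg_r(r, g, b):
    """Lossless RGB -> YCoCg-R via integer lifting steps (H.264 FRExt)."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg

def ycocg_r_to_rgb(y, co, cg):
    """Exact inverse: undo the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b
```

Because every step is an integer add or shift, the inverse recovers each pixel exactly, which is what makes such transforms usable for lossless coding.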
Citations: 4
Correlation estimation for distributed wireless video communication
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706372
Xiaoliang Zhu, N. Zhang, Xiaopeng Fan, Ruiqin Xiong, Debin Zhao
One important problem in distributed video coding is to estimate the variance of the correlation noise between the video signal and its decoder side information. This variance is hard to estimate due to the lack of motion vectors at the encoder side. In this paper, we first propose a linear model that estimates this variance by referring to the zero-motion prediction at the encoder, based on a Markov field assumption. Furthermore, we consider not only the prediction noise from the video signal itself but also the additional noise due to wireless transmission. We applied our correlation estimation method in our recent distributed wireless visual communication framework, DCAST. The experimental results show that the proposed method improves video PSNR by 0.5-1.5 dB while avoiding motion estimation at the encoder.
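A rough sketch of the encoder-side quantities involved, assumed from standard distributed-video-coding practice rather than taken from the paper (whose linear-model coefficients the abstract does not give): the variance of the zero-motion residual is observable at the encoder, and correlation noise is commonly modeled as a zero-mean Laplacian.

```python
import numpy as np

def zero_motion_residual_variance(frame, prev_frame):
    """Variance of the residual against the zero-motion (co-located) prediction,
    the encoder-side observable such a linear model would start from."""
    resid = frame.astype(float) - prev_frame.astype(float)
    return resid.var()

def laplacian_alpha(variance):
    """Parameter of the Laplacian commonly used to model correlation noise:
    a zero-mean Laplacian with variance v has alpha = sqrt(2 / v)."""
    return np.sqrt(2.0 / variance)
```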
Citations: 0
Video viewer state estimation using gaze tracking and video content analysis
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706365
Jae-Woo Kim, Jong-Ok Kim
In this paper, we propose a novel viewer state model based on gaze tracking and video content analysis. This paper makes two primary contributions. First, we significantly improve gaze state classification by incorporating video content analysis. Second, based on the estimated gaze state, we propose a novel viewer state model indicating both the viewer's interest and the existence of the viewer's ROIs. Experiments were conducted to verify the performance of the proposed gaze state classifier and viewer state model. The experimental results show that using video content analysis in gaze state classification considerably improves the classification results, and consequently the viewer state model correctly estimates the interest state of video viewers.
Citations: 1
An Apriori-like algorithm for automatic extraction of the common action characteristics
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706394
Tran Thang Thanh, Fan Chen, K. Kotani, H. Le
With the development of technologies such as specialized 3D markers, we can capture the motion signals of marker joints and create huge sets of 3D action MoCap data. The better we understand human actions, the better we can apply them to applications such as security, sports analysis, and games. In order to find semantically representative features of human actions, we extract the sets of action characteristics that appear frequently in the database. We then propose an Apriori-like algorithm to automatically extract the common sets shared by different action classes. The extracted representative action characteristics are defined at the semantic level, so they better describe the intrinsic differences between various actions. Our experiments show that the knowledge extracted by this method recognizes actions with high accuracy, over 80% on both training and testing data.
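The level-wise search the title alludes to is the classic Apriori pattern: grow frequent itemsets one element at a time, pruning any candidate with an infrequent subset. A generic sketch on toy transactions follows; the `transactions` and `min_support` names are illustrative, and the paper applies the idea to action characteristics rather than market baskets.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Classic Apriori: grow frequent itemsets level by level, pruning by support."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(itemset <= t for t in transactions)

    items = sorted({i for t in transactions for i in t})
    frequent = {}
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support]
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s)
        # Candidate (k+1)-itemsets: unions of frequent k-itemsets whose
        # every k-subset is itself frequent (the Apriori pruning rule).
        cand = {a | b for a in level for b in level if len(a | b) == k + 1}
        level = [c for c in cand
                 if all(frozenset(sub) in frequent for sub in combinations(c, k))
                 and support(c) >= min_support]
        k += 1
    return frequent
```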
Citations: 2
Object co-segmentation based on directed graph clustering
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706376
Fanman Meng, Bing Luo, Chao Huang
In this paper, we develop a new algorithm to segment multiple common objects from a group of images. Our method consists of two steps: directed graph clustering and prior propagation. Clustering groups the local regions of the original images, and foreground priors are generated from these clusters. The second step propagates the prior of each class and locates the common objects in the images via a foreground map. Finally, we use the foreground map as the unary term of a Markov random field segmentation and segment the common objects with the graph-cuts algorithm. We test our method on the FlickrMFC and ICoseg datasets. The experimental results show that the proposed method achieves higher accuracy than several state-of-the-art co-segmentation methods.
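The abstract does not describe its directed-graph clustering in detail; as a stand-in illustration only, here is one very simple directed-graph clustering over region features: give each region a single outgoing edge to its nearest neighbor and take weakly connected components as clusters. Everything below is hypothetical, not the authors' construction.

```python
import numpy as np

def nearest_neighbor_clusters(features):
    """Directed-graph clustering sketch: each node points to its nearest
    neighbor; weakly connected components of that graph become clusters."""
    n = len(features)
    d = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # no self-edges
    succ = d.argmin(axis=1)                   # one outgoing edge per node

    # Union-find over the undirected version of the edges.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    for i, j in enumerate(succ):
        parent[find(i)] = find(int(j))
    return [find(i) for i in range(n)]        # cluster label per node
```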
Citations: 7
Bilinear decomposition for blended expressions representation
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706355
Catherine Soladié, R. Séguier, Nicolas Stoiber
This paper proposes a new method for the analysis of blended expressions of varying intensity. The method is based on an asymmetric bilinear model learned from a small set of expressions. In the resulting expression space, a blended unknown expression has a signature that can be interpreted as a mixture of the basic expressions used to create the space. Three methods are compared: a traditional method based on active appearance vectors, the asymmetric bilinear model on person-independent appearance vectors, and the asymmetric bilinear model on person-specific appearance vectors. Experimental results on the recognition of 14 blended unknown expressions show the relevance of the bilinear models compared with appearance-based methods, and the robustness of the person-specific models with respect to the types of parameters (shape and/or texture).
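The signature-as-mixture idea can be sketched with a plain linear model, which is a simplification: the paper's model is asymmetric bilinear with person factors, while this sketch drops the person dimension entirely. Learn a basis from a few basic expressions, take signatures as coordinates in that space, and read a blend's signature as a mixture of the basic signatures; all data below is synthetic.

```python
import numpy as np

# Training matrix: columns are appearance vectors of a few basic expressions.
rng = np.random.default_rng(2)
basics = rng.random((30, 4))                  # 4 basic expressions, dim 30
U, _, _ = np.linalg.svd(basics, full_matrices=False)

def signature(x):
    """Coordinates of an appearance vector in the learned expression space."""
    return U.T @ x

# A blend of 70% basic expression 0 and 30% basic expression 2 ...
blend = 0.7 * basics[:, 0] + 0.3 * basics[:, 2]
# ... whose signature decomposes into the basic-expression signatures.
weights, *_ = np.linalg.lstsq(signature(basics), signature(blend), rcond=None)
```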
Citations: 0
Estimation of the primary quantization parameter in MPEG videos
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706413
Wan Wang, Xinghao Jiang, Shilin Wang, Tanfeng Sun
Advanced technology and sophisticated software have left audiovisual content exposed to forgery, inspiring the emergence of multimedia forensics research. Since video tampering may involve double compression, the analysis of compression history is significant. In this paper, we consider the processing chain of two compression steps and propose an algorithm that identifies the quantization parameter used in the earlier coding process. The method relies on the fact that characteristic footprints can be observed under different relationships between the quantization parameters of consecutive compression operations. Features are extracted from both the Discrete Cosine Transform (DCT) coefficients and their differential counterparts to capture the statistical disturbance. Experimental results demonstrate the effectiveness of our method.
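The double-quantization footprint such methods build on can be reproduced in a few lines: requantizing coefficients already quantized with step q1 using a different step q2 leaves characteristic gaps in the level histogram. The q1=5 / q2=3 choice and the Laplacian source model below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def double_quantize(coeffs, q1, q2):
    """Quantize with step q1, dequantize, then requantize with step q2:
    the chain a twice-compressed video's DCT coefficients go through."""
    once = np.round(coeffs / q1) * q1
    return np.round(once / q2).astype(int)

def histogram_feature(levels, max_abs=20):
    """Normalized histogram of quantization levels, the raw material for
    footprint-based estimation of the primary quantization parameter."""
    hist, _ = np.histogram(levels, bins=np.arange(-max_abs - 0.5, max_abs + 1.5))
    return hist / hist.sum()

rng = np.random.default_rng(1)
coeffs = rng.laplace(scale=8.0, size=10000)   # toy AC-coefficient model
h = histogram_feature(double_quantize(coeffs, q1=5, q2=3))
```

With q1=5 and q2=3, the first-pass output is a multiple of 5, so second-pass levels such as +1 and -1 are unreachable: those empty histogram bins are the footprint.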
Citations: 1
Visually lossless screen content coding using HEVC base-layer
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706364
Geert Braeckman, Shahid M. Satti, Heng Chen, A. Munteanu, P. Schelkens
This paper presents a novel two-layer coding framework targeting visually lossless compression of screen content video. The proposed framework employs the conventional HEVC standard for the base layer. For the enhancement layer, a hybrid spatial-temporal block-prediction mechanism is introduced to keep the energy of the error residual small. Spatial prediction is generally chosen for dynamic areas, while temporal prediction works better for static areas of a video frame. The prediction residual is quantized according to whether a given block is static or dynamic. Run-length coding, Golomb-based binarization and context-based arithmetic coding are employed to efficiently code the quantized residual and form the enhancement layer. Performance evaluations using 4:4:4 screen content sequences show that, for visually lossless video quality, the proposed system saves significant bit-rate compared to a two-layer lossless HEVC framework.
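Of the enhancement-layer tools listed, run-length coding is the easiest to sketch: it collapses the long zero runs that quantized screen-content residuals typically contain. A minimal generic implementation, not the paper's exact entropy-coding chain:

```python
def run_length_encode(symbols):
    """Run-length code a sequence as (value, run) pairs; effective on the
    long zero runs left after quantizing screen-content residuals."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1                  # extend the current run
        else:
            runs.append([s, 1])               # start a new run
    return [tuple(r) for r in runs]

def run_length_decode(runs):
    """Exact inverse: expand each (value, run) pair."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out
```

In a real coder the (value, run) pairs would then be binarized (e.g. with Golomb codes) and arithmetic-coded, as the abstract describes.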
Citations: 3