
Latest publications: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

Is there a general structure for grammars?
D. Mumford
Summary form only given. Linguists have proposed dozens of formalisms for grammars, and now vision is weighing in with its own versions based on its needs. Ulf Grenander has proposed general pattern theory, and has used grammar-like graphical parses of "thoughts" in the style of AI. One wants a natural, simple formalism treating all these cases. I want to pose this as a central problem in modeling intelligence. Pattern theory started in the 1970s with the ideas of Ulf Grenander and his school at Brown. The aim is to analyze, from a statistical point of view, the patterns in all "signals" generated by the world, whether they be images, sounds, written text, DNA or protein strings, spike trains in neurons, time series of prices or weather, etc. Pattern theory proposes that the types of patterns (and the hidden variables needed to describe them) found in one class of signals will often be found in the others, and that their characteristic variability will be similar. The underlying idea is to find classes of stochastic models which can capture all the patterns that we see in nature, so that random samples from these models have the same "look and feel" as samples from the world itself. Then the detection of patterns in noisy and ambiguous samples can be achieved by the use of Bayes' rule, a method that can be described as "analysis by synthesis".
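The "analysis by synthesis" idea in this abstract can be sketched in a few lines: hypothesize a hidden pattern, synthesize the signal it would generate, and score hypotheses against the noisy observation with Bayes' rule. The two templates, the prior, and the Gaussian noise model below are toy assumptions, not models from pattern theory.

```python
import numpy as np

# Hedged sketch of "analysis by synthesis": hypothesize a hidden pattern,
# synthesize the signal it would generate, and score hypotheses against the
# noisy observation with Bayes' rule. The two templates, the prior, and the
# Gaussian noise model are toy assumptions.

rng = np.random.default_rng(0)

templates = {"ramp": np.linspace(0.0, 1.0, 50),
             "step": np.repeat([0.0, 1.0], 25)}
prior = {"ramp": 0.5, "step": 0.5}
sigma = 0.2                                            # assumed noise level

observation = templates["step"] + rng.normal(0.0, sigma, 50)  # noisy sample

def log_likelihood(obs, synth):
    # Gaussian noise model: log p(obs | hypothesis), up to a constant
    return -0.5 * np.sum((obs - synth) ** 2) / sigma**2

# Posterior is proportional to likelihood times prior for each hypothesis
log_post = {h: log_likelihood(observation, t) + np.log(prior[h])
            for h, t in templates.items()}
best = max(log_post, key=log_post.get)
print(best)
```

The maximum-a-posteriori hypothesis is the synthesized pattern that best explains the noisy sample.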
Citations: 0
3D stochastic completion fields for fiber tractography
P. MomayyezSiahkal, Kaleem Siddiqi
We approach the problem of fiber tractography from the viewpoint that a computational theory should relate to the underlying quantity that is being measured - the diffusion of water molecules. We characterize the Brownian motion of water by a 3D random walk described by a stochastic non-linear differential equation. We show that the maximum-likelihood trajectories are 3D elastica, or curves of least energy. We illustrate the model with Monte-Carlo (sequential) simulations and then develop a more efficient (local, parallelizable) implementation, based on the Fokker-Planck equation. The final algorithm allows us to efficiently compute stochastic completion fields to connect a source region to a sink region, while taking into account the underlying diffusion MRI data. We demonstrate promising tractography results using high angular resolution diffusion data as input.
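The Monte-Carlo (sequential) simulation mentioned above can be sketched as a discrete 3D random walk in which a particle translates along its current unit direction while the direction itself diffuses. All parameters and the discretization below are illustrative assumptions, not the paper's calibrated model.

```python
import numpy as np

# Illustrative Monte-Carlo sketch of the 3D random walk behind a stochastic
# completion field: each particle moves along its current unit direction
# while the direction itself diffuses (a discrete stand-in for the
# stochastic differential equation). All parameters here are assumptions.

rng = np.random.default_rng(1)

def random_walk(n_particles=500, n_steps=100, step=0.1, sigma=0.05):
    pos = np.zeros((n_particles, 3))                    # all start at a source point
    direc = np.tile([1.0, 0.0, 0.0], (n_particles, 1))  # initial heading along +x
    for _ in range(n_steps):
        pos += step * direc                             # translate along heading
        direc += rng.normal(0.0, sigma, direc.shape)    # perturb the heading...
        direc /= np.linalg.norm(direc, axis=1, keepdims=True)  # ...back onto the sphere
    return pos

final = random_walk()
mean_x = final[:, 0].mean()   # forward progress is less than the path length
print(final.shape, mean_x)
```

A completion field would be obtained by accumulating the visit counts of such walkers from a source and a sink; the Fokker-Planck implementation in the paper computes the same density without sampling.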
Citations: 10
Nonparametric bottom-up saliency detection by self-resemblance
H. Seo, P. Milanfar
We present a novel bottom-up saliency detection algorithm. Our method computes so-called local regression kernels (i.e., local features) from the given image, which measure the likeness of a pixel to its surroundings. Visual saliency is then computed using this "self-resemblance" measure. The framework yields a saliency map in which each pixel indicates the statistical likelihood of saliency of a feature matrix given its surrounding feature matrices. As the similarity measure, matrix cosine similarity (a generalization of cosine similarity) is employed. State-of-the-art performance is demonstrated on commonly used human eye-fixation data [3] and on some psychological patterns.
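Matrix cosine similarity, the generalization of cosine similarity named in the abstract, is simply the Frobenius inner product of two feature matrices divided by the product of their Frobenius norms. A minimal sketch:

```python
import numpy as np

# Minimal sketch of matrix cosine similarity: the Frobenius inner product of
# two feature matrices divided by the product of their Frobenius norms.

def matrix_cosine_similarity(A, B):
    # <A, B>_F / (||A||_F ||B||_F), lies in [-1, 1]
    return np.sum(A * B) / (np.linalg.norm(A) * np.linalg.norm(B))

A = np.eye(2)
B = np.array([[0.0, 1.0], [1.0, 0.0]])
same = matrix_cosine_similarity(A, A)          # identical matrices
scaled = matrix_cosine_similarity(A, 2.0 * A)  # invariant to scaling
ortho = matrix_cosine_similarity(A, B)         # orthogonal matrices
print(same, scaled, ortho)
```

Like vector cosine similarity, it discards overall magnitude, which is why it can compare feature matrices across image regions with different contrast.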
Citations: 116
Multiple label prediction for image annotation with multiple Kernel correlation models
Oksana Yakhnenko, Vasant G Honavar
Image annotation is the challenging task of correlating text keywords with an image. In this paper we address the problem of image annotation using a kernel Multiple Linear Regression model. The Multiple Linear Regression (MLR) model reconstructs an image's caption by performing a linear transformation of the image into a semantic space, and then recovers the caption by a second linear transformation from the semantic space into the label space. The model is trained so that its parameters directly minimize the reconstruction error. This model is related to Canonical Correlation Analysis (CCA), which maps both images and captions into the semantic space so as to minimize the mapping distance there. The kernel trick is then applied to MLR, yielding the Kernel Multiple Linear Regression (KMLR) model. The solution to KMLR is the solution to a generalized eigenvalue problem, related to Kernel Canonical Correlation Analysis (KCCA). We then extend the Kernel Multiple Linear Regression and Kernel Canonical Correlation Analysis models to the multiple-kernel setting, to allow various representations of images and captions. We present results for image annotation using multiple-kernel-learning CCA and MLR on the Oliva and Torralba (2001) scene-recognition data that show kernel selection behaviour.
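The two ingredients the abstract combines, a linear regression from image features to label vectors and its kernelized counterpart, can be sketched on synthetic data. This is a toy stand-in, not the paper's exact KMLR estimator (no semantic-space factorization or generalized eigenproblem here).

```python
import numpy as np

# Toy sketch (not the paper's exact estimator) of regressing label-space
# vectors from image features: a linear map fit by least squares, plus a
# kernel-ridge variant obtained with the kernel trick. Data are synthetic
# and noiseless, so both fits should reconstruct the targets almost exactly.

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 5))          # image features (n x d)
W_true = rng.normal(size=(5, 3))
Y = X @ W_true                        # caption/label targets (n x k)

# Linear multiple regression: W = argmin_W ||Y - X W||_F^2
W = np.linalg.lstsq(X, Y, rcond=None)[0]
fit_ok = np.allclose(X @ W, Y, atol=1e-6)

# Kernel trick with a linear kernel: Y_hat = K (K + lam I)^-1 Y
K = X @ X.T
lam = 1e-8
Y_hat = K @ np.linalg.solve(K + lam * np.eye(len(K)), Y)
ridge_ok = np.allclose(Y_hat, Y, atol=1e-4)

print(fit_ok, ridge_ok)
```

Swapping the linear kernel `X @ X.T` for any other positive-definite kernel is what lets the multiple-kernel extension mix several image and caption representations.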
Citations: 15
A method for selecting and ranking quality metrics for optimization of biometric recognition systems
N. Schmid, Francesco Nicolo
In the field of biometrics, evaluation of the quality of biometric samples has a number of important applications. The main applications include (1) rejecting poor-quality images during acquisition, (2) serving as an enhancement metric, and (3) acting as a weighting factor in fusion schemes. Since a biometric-based recognition system relies on measures of performance such as matching scores and recognition error probability, it is intuitive that metrics evaluating biometric sample quality have to be linked to the recognition performance of the system. The goal of this work is to design a method for evaluating and ranking various quality metrics applied to biometric images or signals, based on their ability to predict the recognition performance of a biometric recognition system. The proposed method involves: (1) a preprocessing algorithm operating on pairs of quality scores and generating relative scores, (2) an adaptive multivariate mapping relating quality scores to measures of recognition performance, and (3) a ranking algorithm that selects the best combinations of quality measures. The performance of the method is demonstrated on face and iris biometric data.
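The core premise, that a useful quality metric must predict recognition performance, can be illustrated with a much cruder proxy than the paper's multivariate mapping: rank candidate metrics by the absolute correlation between their quality scores and genuine match scores. The metrics and data below are made-up stand-ins.

```python
import numpy as np

# Hedged sketch of ranking quality metrics by how well their scores predict
# recognition performance, proxied here by absolute Pearson correlation with
# (synthetic, hypothetical) genuine match scores.

rng = np.random.default_rng(3)
match_scores = rng.uniform(0, 1, 200)

quality_metrics = {
    "sharpness": match_scores + rng.normal(0.0, 0.05, 200),  # informative by construction
    "brightness": rng.uniform(0, 1, 200),                    # unrelated to performance
}

def predictiveness(q, m):
    # absolute Pearson correlation as a crude predictiveness score
    return abs(np.corrcoef(q, m)[0, 1])

ranking = sorted(quality_metrics,
                 key=lambda name: predictiveness(quality_metrics[name], match_scores),
                 reverse=True)
print(ranking)
```

The paper replaces this single correlation with relative scores, an adaptive multivariate mapping, and a combinatorial selection step, but the ranking criterion is the same in spirit: predictive metrics first.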
Citations: 9
GPU-accelerated, gradient-free MI deformable registration for atlas-based MR brain image segmentation
Xiao Han, L. Hibbard, V. Willcut
Brain structure segmentation is an important task in many neuroscience and clinical applications. In this paper, we introduce a novel MI-based dense deformable registration method and apply it to the automatic segmentation of detailed brain structures. Together with a multiple atlas fusion strategy, very accurate segmentation results were obtained, as compared with other reported methods in the literature. To make multi-atlas segmentation computationally feasible, we also propose to take advantage of the recent advancements in GPU technology and introduce a GPU-based implementation of the proposed registration method. With GPU acceleration it takes less than 8 minutes to compile a multi-atlas segmentation for each subject even with as many as 17 atlases, which demonstrates that the use of GPUs can greatly facilitate the application of such atlas-based segmentation methods in practice.
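The mutual-information similarity that MI-based registration maximizes can be estimated from a joint intensity histogram of the two images. The bin count and test images below are illustrative choices.

```python
import numpy as np

# Minimal sketch of the mutual-information similarity used in MI-based
# registration: estimate MI from a joint intensity histogram.

def mutual_information(a, b, bins=16):
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)      # marginal of image a
    py = pxy.sum(axis=0, keepdims=True)      # marginal of image b
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(4)
img = rng.uniform(0, 1, (64, 64))
aligned = mutual_information(img, img)                           # identical images
unrelated = mutual_information(img, rng.uniform(0, 1, (64, 64)))
print(aligned, unrelated)
```

MI peaks when the images are aligned and drops toward zero for unrelated intensities, which is why it works across modalities; the per-voxel histogram accumulation is also the kind of data-parallel work that maps naturally onto a GPU.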
Citations: 33
Fuzzy statistical modeling of dynamic backgrounds for moving object detection in infrared videos
Fida El Baf, T. Bouwmans, B. Vachon
The Mixture of Gaussians (MOG) is the most popular technique for background modeling, but it shows limitations when dynamic changes occur in the scene, such as camera jitter or motion in the background. Furthermore, the MOG is initialized using a training sequence which may be noisy and/or insufficient to model the background correctly. Because of the associated uncertainty, all these critical situations generate false classifications in the foreground detection mask. In this context, we present a background modeling algorithm based on a Type-2 Fuzzy Mixture of Gaussians which is particularly suitable for infrared videos. The use of Type-2 Fuzzy Set Theory makes it possible to take this uncertainty into account. Results on the OTCBVS benchmark/test dataset videos show the robustness of the proposed method in the presence of dynamic backgrounds.
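The Gaussian background-subtraction idea this paper extends can be sketched with a single running Gaussian per pixel (one component rather than a full mixture, and crisp rather than Type-2 fuzzy): a pixel is flagged foreground when it falls outside k standard deviations of the learned background. The learning rate, threshold, and scene below are assumptions.

```python
import numpy as np

# Simplified sketch in the spirit of MOG background subtraction: a per-pixel
# running Gaussian (single component, crisp rather than Type-2 fuzzy) updated
# online; a pixel is foreground when outside k standard deviations.

alpha, k = 0.05, 2.5   # learning rate and deviation threshold (assumptions)

def update(mu, var, frame):
    d2 = (frame - mu) ** 2
    fg = d2 > k**2 * var                    # foreground mask for this frame
    mu = (1 - alpha) * mu + alpha * frame   # running mean
    var = (1 - alpha) * var + alpha * d2    # running variance
    return mu, np.maximum(var, 1e-4), fg

rng = np.random.default_rng(5)
mu, var = np.full((8, 8), 0.5), np.full((8, 8), 0.01)
for _ in range(50):                         # learn a static, slightly noisy background
    mu, var, _ = update(mu, var, 0.5 + rng.normal(0.0, 0.02, (8, 8)))

frame = 0.5 + rng.normal(0.0, 0.02, (8, 8))
frame[2:4, 2:4] = 1.0                       # a bright "moving object"
mu, var, fg = update(mu, var, frame)
object_hit = bool(fg[2:4, 2:4].all())
print(object_hit, fg.mean())
```

The Type-2 fuzzy extension in the paper would, roughly, replace the crisp mean and variance with intervals so that the uncertainty from a noisy training sequence propagates into the foreground decision.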
Citations: 51
A framework for automated measurement of the intensity of non-posed Facial Action Units
M. Mahoor, S. Cadavid, D. Messinger, J. Cohn
This paper presents a framework to automatically measure the intensity of naturally occurring facial actions. Naturalistic expressions are non-posed, spontaneous actions. The Facial Action Coding System (FACS) is the gold-standard technique for describing facial expressions, which are parsed into comprehensive, non-overlapping action units (AUs). AU intensities range from absent to maximal on a six-point scale (i.e., 0 to 5). Despite efforts to recognize the presence of non-posed action units, measuring their intensity has not been studied comprehensively. In this paper, we develop a framework to measure the intensity of AU12 (lip corner puller) and AU6 (cheek raiser) in videos captured from live face-to-face communications between infants and mothers. AU12 and AU6 are the most challenging cases among infants' expressions (e.g., due to the low facial texture of an infant's face). One of the problems in facial image analysis is the large dimensionality of the visual data. Our approach to this problem is to utilize the spectral regression technique to project high-dimensional facial images into a low-dimensional space. The facial images represented in the low-dimensional space are used to train support vector machine classifiers to predict the intensity of action units. Analysis of 18 minutes of captured video of non-posed facial expressions of several infants and mothers shows significant agreement between a human FACS coder and our approach, which makes it an efficient approach for automated measurement of the intensity of non-posed facial action units.
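The dimensionality-reduction-then-predict pipeline can be illustrated on synthetic data. This is a loose stand-in: truncated SVD replaces spectral regression, ridge regression replaces the SVM classifiers, and the "face" vectors simply encode intensity along one direction plus noise.

```python
import numpy as np

# Toy stand-in for the pipeline: project high-dimensional "face" vectors to
# a low-dimensional space (truncated SVD, standing in for spectral
# regression) and fit a simple regressor for AU intensity on the 0-5 scale
# (ridge regression, standing in for the SVM classifiers).

rng = np.random.default_rng(6)
n, d, k = 120, 400, 10
intensity = rng.integers(0, 6, n)                  # ground-truth levels 0..5
basis = rng.normal(size=d)
X = np.outer(intensity, basis) + rng.normal(0.0, 0.5, (n, d))

_, _, Vt = np.linalg.svd(X, full_matrices=False)   # low-dimensional embedding
Z = X @ Vt[:k].T                                   # n x k features

lam = 1e-3                                         # ridge regularizer
w = np.linalg.solve(Z.T @ Z + lam * np.eye(k), Z.T @ intensity)
pred = np.clip(np.round(Z @ w), 0, 5)
accuracy = float((pred == intensity).mean())
print(accuracy)
```

On this easy synthetic problem the low-dimensional embedding preserves the intensity signal almost perfectly; real infant-face data is far harder, which is the paper's point.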
Citations: 122
Inference and learning with hierarchical compositional models
Iasonas Kokkinos, A. Yuille
Summary form only given: In this work we consider the problem of object parsing, namely detecting an object and its components by composing them from image observations. We address the computational complexity of the inference problem: we exploit our hierarchical object representation to efficiently compute a coarse solution, which we then use to guide search at a finer level. Starting from our adaptation of the A* parsing algorithm to the problem of object parsing, we then propose a coarse-to-fine approach that is capable of detecting multiple objects simultaneously. We extend this work to automatically learn a hierarchical model for a category from a set of training images for which only the bounding box is available. Our approach consists of (a) automatically registering a set of training images and constructing an object template, (b) recovering object contours, (c) finding object parts based on contour affinities, and (d) discriminatively learning a parsing cost function.
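The search machinery behind A*-style parsing can be shown with a minimal A* on a toy grid, where an admissible Manhattan heuristic plays the role that the coarse-level solution plays in the coarse-to-fine scheme. The parsing-specific costs and object grammar are not reproduced here.

```python
import heapq

# Minimal A* on a grid: an admissible heuristic (plain Manhattan distance,
# standing in for a bound computed at the coarse level) guides expansion.

def astar(grid, start, goal):
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), 0, start)]       # (f = g + h, g, position)
    best = {start: 0}
    while frontier:
        f, g, p = heapq.heappop(frontier)
        if p == goal:
            return g                        # cost of the optimal path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            q = (p[0] + dr, p[1] + dc)
            if (0 <= q[0] < len(grid) and 0 <= q[1] < len(grid[0])
                    and grid[q[0]][q[1]] == 0 and g + 1 < best.get(q, float("inf"))):
                best[q] = g + 1
                heapq.heappush(frontier, (g + 1 + h(q), g + 1, q))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))  # shortest path length around the wall
```

Because the heuristic never overestimates, the first time the goal is popped its cost is optimal; a coarse parse supplies exactly this kind of optimistic bound for the fine-level search.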
Citations: 2
An affine Invariant hyperspectral texture descriptor based upon heavy-tailed distributions and fourier analysis
P. Khuwuthyakorn, A. Robles-Kelly, J. Zhou
In this paper, we address the problem of recovering a hyperspectral texture descriptor. We do this by viewing the wavelength-indexed bands corresponding to the texture in the image as arising from a stochastic process whose statistics can be captured by making use of the relationships between moment generating functions and Fourier kernels. In this manner, we can interpret the probability distribution of the hyperspectral texture as a heavy-tailed one, which can be rendered invariant to affine geometric transformations on the texture plane by making use of the spectral power of its Fourier cosine transform. We do this by recovering the affine geometric distortion matrices corresponding to the probability density function for the texture under study. This treatment permits the development of a robust descriptor which has a high information-compaction property and can capture the space and wavelength correlations of the spectra in hyperspectral images. We illustrate the utility of our descriptor for recognition purposes and provide results on real-world datasets. We also compare our results to those yielded by a number of alternatives.
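A small piece of the spectral-power idea can be demonstrated directly: the power spectrum of a texture patch is invariant to circular translation, and normalizing it removes overall contrast. The paper goes much further (Fourier cosine transform, full affine invariance on the texture plane, wavelength-indexed bands); a plain 2D FFT on a single grayscale patch is used below as an illustration.

```python
import numpy as np

# Toy sketch of a spectral-power texture descriptor: the power spectrum of a
# patch is invariant to circular translation, and normalizing it removes
# overall contrast. (The paper uses the Fourier cosine transform and achieves
# full affine invariance; a plain FFT is used here.)

def spectral_descriptor(patch):
    power = np.abs(np.fft.fft2(patch)) ** 2
    return power / power.sum()          # normalize out overall contrast

rng = np.random.default_rng(7)
texture = rng.uniform(size=(32, 32))
shifted = np.roll(texture, (5, 9), axis=(0, 1))   # circular translation

d1 = spectral_descriptor(texture)
d2 = spectral_descriptor(shifted)
print(np.allclose(d1, d2))
```

Translation only changes the Fourier phase, so the power spectrum is unchanged; handling rotations, shears, and scalings requires the affine distortion matrices the paper recovers.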
{"title":"An affine Invariant hyperspectral texture descriptor based upon heavy-tailed distributions and fourier analysis","authors":"P. Khuwuthyakorn, A. Robles-Kelly, J. Zhou","doi":"10.1109/CVPRW.2009.5204126","DOIUrl":"https://doi.org/10.1109/CVPRW.2009.5204126","url":null,"abstract":"In this paper, we address the problem of recovering a hyperspectral texture descriptor. We do this by viewing the wavelength-indexed bands corresponding to the texture in the image as those arising from a stochastic process whose statistics can be captured making use of the relationships between moment generating functions and Fourier kernels. In this manner, we can interpret the probability distribution of the hyper-spectral texture as a heavy-tailed one which can be rendered invariant to affine geometric transformations on the texture plane making use of the spectral power of its Fourier cosine transform. We do this by recovering the affine geometric distortion matrices corresponding to the probability density function for the texture under study. This treatment permits the development of a robust descriptor which has a high information compaction property and can capture the space and wavelength correlation for the spectra in the hyperspectral images. We illustrate the utility of our descriptor for purposes of recognition and provide results on real-world datasets. 
We also compare our results to those yielded by a number of alternatives.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126655433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
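The abstract above builds its descriptor on the spectral power of a Fourier-type transform, exploiting the fact that power spectra are insensitive to certain geometric transforms of the input. A minimal 1D analogue of that invariance property: the magnitude of the discrete Fourier transform is unchanged by circular shifts of the signal. This sketch is only an illustration of that one property, not the authors' hyperspectral descriptor; the function name is an assumption made for the example.

```python
import cmath

def power_spectrum(signal):
    """DFT magnitudes of a real signal; invariant to circular shifts,
    since a shift only multiplies each DFT coefficient by a unit phase."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]
```

Two circularly shifted copies of a signal thus yield identical feature vectors, which is the kind of behavior a transform-domain texture descriptor relies on; the paper's affine invariance on the texture plane is a stronger, 2D version of this idea obtained via the Fourier cosine transform.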
Journal
2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops