
Latest publications from the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

Towards automated large scale discovery of image families
M. Aly, P. Welinder, Mario E. Munich, P. Perona
Gathering large collections of images is quite easy nowadays with the advent of image-sharing Web sites such as flickr.com. However, such collections inevitably contain duplicates and highly similar images, which we refer to as image families. Automatic discovery and cataloguing of such similar images in large collections is important for many applications, e.g. image search, image collection visualization, and research purposes, among others. In this work, we investigate this problem by thoroughly comparing two broad approaches for measuring image similarity: global vs. local features. We assess their performance as the image collection scales up to over 11,000 images with over 6,300 families. We present our results on three datasets with different statistics, including two new challenging datasets. Moreover, we present a new algorithm to automatically determine the number of families in the collection, with promising results.
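The family-discovery pipeline rests on pairwise image similarity. Below is a minimal sketch (in Python with OpenCV; not the authors' implementation) of the two broad kinds of measure being compared: a global measure from color histograms and a local measure from keypoint matches. ORB stands in here for whichever local feature the paper actually evaluates.

```python
import cv2
import numpy as np

def global_similarity(img_a, img_b, bins=32):
    """Global measure: correlation between HSV color histograms."""
    hists = []
    for img in (img_a, img_b):
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
        h = cv2.calcHist([hsv], [0, 1], None, [bins, bins], [0, 180, 0, 256])
        hists.append(cv2.normalize(h, None).flatten())
    return cv2.compareHist(hists[0], hists[1], cv2.HISTCMP_CORREL)

def local_similarity(img_a, img_b, ratio=0.75):
    """Local measure: count of ratio-test keypoint matches."""
    orb = cv2.ORB_create()
    gray_a, gray_b = (cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
                      for im in (img_a, img_b))
    _, des_a = orb.detectAndCompute(gray_a, None)
    _, des_b = orb.detectAndCompute(gray_b, None)
    if des_a is None or des_b is None:
        return 0
    matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des_a, des_b, k=2)
    return sum(1 for p in matches
               if len(p) == 2 and p[0].distance < ratio * p[1].distance)
```

A clustering step over the resulting pairwise similarity matrix would then group images into families.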
Citations: 23
Multi-view reconstruction for projector camera systems based on bundle adjustment
Furukawa Ryo, K. Inose, Hiroshi Kawasaki
Range scanners using projector-camera systems have been studied actively in recent years as methods for measuring 3D shapes accurately and cost-effectively. To acquire the entire 3D shape of an object with such systems, the shape should be captured from multiple directions and the captured shapes aligned using algorithms such as ICP. The aligned shapes are then integrated into a single 3D shape model. However, the captured shapes are often distorted due to errors in the intrinsic or extrinsic parameters of the camera and the projector. Because of these distortions, gaps between overlapping surfaces remain even after aligning the 3D shapes. In this paper, we propose a new method to capture an entire shape with high precision using an active stereo range scanner consisting of a projector and a camera with fixed relative positions. In the proposed method, minimization of calibration errors of the projector-camera pair and of registration errors between 3D shapes from different viewpoints is achieved simultaneously. The proposed method can be considered a variation of bundle adjustment adapted to projector-camera systems. Since acquiring correspondences between different views is not easy for projector-camera systems, a solution to this problem is also presented.
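The paper casts joint refinement of calibration and registration as a bundle-adjustment problem. The toy sketch below (an assumed translation-only setup, not the paper's system) shows the core idea with scipy: minimize reprojection error jointly over per-view camera parameters and 3D point positions. The paper additionally folds projector parameters and rotations into the same optimization.

```python
import numpy as np
from scipy.optimize import least_squares

def project(points, cam_t, f=800.0):
    """Pinhole projection after a per-view translation (rotations omitted)."""
    p = points + cam_t                      # (N, 3)
    return f * p[:, :2] / p[:, 2:3]         # (N, 2) image coordinates

def residuals(params, n_views, n_pts, observations):
    # The first view is pinned at the origin to remove the gauge freedom.
    cam_t = np.vstack([np.zeros(3),
                       params[:(n_views - 1) * 3].reshape(-1, 3)])
    pts = params[(n_views - 1) * 3:].reshape(n_pts, 3)
    res = [project(pts, cam_t[v]) - observations[v] for v in range(n_views)]
    return np.concatenate(res).ravel()

# Synthetic two-view example: perturb the true parameters, then refine.
rng = np.random.default_rng(0)
n_views, n_pts = 2, 50
true_pts = rng.normal([0.0, 0.0, 10.0], 1.0, (n_pts, 3))
true_t = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
observations = [project(true_pts, t) for t in true_t]
x0 = np.concatenate([true_t[1:].ravel() + 0.05, (true_pts + 0.1).ravel()])
sol = least_squares(residuals, x0, args=(n_views, n_pts, observations))
print("final reprojection RMSE:", np.sqrt(np.mean(sol.fun ** 2)))
```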
Citations: 8
Learning a hierarchical compositional representation of multiple object classes
A. Leonardis
Summary form only given. Visual categorization, recognition, and detection of objects has been an area of active research in the vision community for decades. Ultimately, the goal is to recognize and detect a large number of object classes in images within an acceptable time frame. This problem entangles three highly interconnected issues: an internal object representation that should grow sublinearly with the number of classes, a means of learning the representation from a set of images, and an effective inference algorithm that matches the object representation against the representation produced from the scene. In the main part of the talk I will present our framework for learning a hierarchical compositional representation of multiple object classes. Learning is unsupervised, statistical, and performed bottom-up. The approach takes simple contour fragments and learns their frequent spatial configurations, which recursively combine into increasingly complex and class-specific contour compositions.
Citations: 0
A syntax for image understanding
N. Ahuja
We consider one of the most basic questions in computer vision, that of finding a low-level image representation that could be used to seed diverse subsequent computations of image understanding. Can we define a relatively general-purpose image representation which would serve as the syntax for diverse needs of image understanding? What makes good image syntax? How do we evaluate it? We pose a series of such questions and evolve a set of answers to them, which in turn help evolve an image representation. For concreteness, we first perform this exercise in the specific context of the following problem.
Citations: 0
Accurate estimation of pulmonary nodule's growth rate in CT images with nonrigid registration and precise nodule detection and segmentation
Yuanjie Zheng, C. Kambhamettu, T. Bauer, K. Steiner
We propose a new tumor growth measure for pulmonary nodules in CT images that accounts for the tumor deformation caused by differences in inspiration level. It is accomplished with a new nonrigid lung registration process that handles the tumor expanding/shrinking problem occurring in many conventional nonrigid registration methods. The accurate nonrigid registration is performed by weighting the matching cost of each voxel, based on the result of a new nodule detection approach and a powerful nodule segmentation algorithm. Comprehensive experiments show the high accuracy of our algorithms and the promising results of our new tumor growth measure.
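Once the registered volumes yield comparable nodule segmentations, the growth rate itself reduces to a simple calculation. Below is a minimal sketch under an exponential-growth assumption, using the standard volume-doubling-time formula (the specific measure defined in the paper may differ):

```python
import math

def doubling_time(v1_mm3, v2_mm3, days_between):
    """Volume doubling time in days, assuming exponential growth."""
    if v2_mm3 <= v1_mm3:
        raise ValueError("no measurable growth between the two scans")
    return days_between * math.log(2.0) / math.log(v2_mm3 / v1_mm3)

# Example: a nodule growing from 500 mm^3 to 650 mm^3 over 90 days.
print(f"{doubling_time(500.0, 650.0, 90):.0f} days")  # about 238 days
```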
Citations: 16
Feature based person detection beyond the visible spectrum
K. Jüngling, Michael Arens
One of the main challenges in computer vision is the automatic detection of specific object classes in images. Recent advances in object detection performance in the visible spectrum encourage the application of these approaches to data beyond the visible spectrum. In this paper, we show the applicability of a well-known, local-feature-based object detector to the case of people detection in thermal data. We adapt the detector to the special conditions of infrared data and show the specifics relevant to feature-based object detection. For that, we employ the SURF feature detector and descriptor, which is well suited to infrared data. We evaluate the performance of our adapted object detector on the task of person detection in different real-world scenarios where people occur at multiple scales. Finally, we show how this local-feature-based detector can be used to recognize specific object parts, i.e., body parts of detected people.
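A minimal sketch of the detect-and-describe step on a single-channel thermal frame is shown below. The paper uses SURF, which in OpenCV lives in the opencv-contrib xfeatures2d module and is subject to the nonfree build flag; ORB is used here only as a freely available stand-in, and `frame_16bit` is an assumed raw sensor frame.

```python
import cv2
import numpy as np

def thermal_keypoints(frame_16bit):
    """Detect and describe local features on a raw 16-bit thermal frame."""
    # Compress the radiometric range to 8 bits before feature detection.
    img8 = cv2.normalize(frame_16bit, None, 0, 255, cv2.NORM_MINMAX)
    img8 = img8.astype(np.uint8)
    detector = cv2.ORB_create(nfeatures=500)
    keypoints, descriptors = detector.detectAndCompute(img8, None)
    return keypoints, descriptors

# A person detector would then let these keypoints vote for person
# hypotheses, e.g. through an implicit-shape-model style codebook.
```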
Citations: 62
Fuzzy statistical modeling of dynamic backgrounds for moving object detection in infrared videos
Fida El Baf, T. Bouwmans, B. Vachon
Mixture of Gaussians (MOG) is the most popular technique for background modeling, but it has limitations when dynamic changes occur in the scene, such as camera jitter and motion in the background. Furthermore, the MOG is initialized using a training sequence, which may be noisy and/or insufficient to model the background correctly. All these critical situations generate false classifications in the foreground detection mask due to the related uncertainty. In this context, we present a background modeling algorithm based on a Type-2 Fuzzy Mixture of Gaussians that is particularly suitable for infrared videos. The use of Type-2 Fuzzy Set Theory makes it possible to take this uncertainty into account. Results on videos from the OTCBVS benchmark/test dataset show the robustness of the proposed method in the presence of dynamic backgrounds.
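For reference, here is a baseline MOG background-subtraction loop using OpenCV's standard (non-fuzzy) Gaussian-mixture model; the paper replaces the crisp per-pixel membership decision with Type-2 fuzzy memberships to absorb the uncertainty described above. "infrared.avi" is a placeholder path.

```python
import cv2

cap = cv2.VideoCapture("infrared.avi")   # placeholder input path
mog = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                         detectShadows=False)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = mog.apply(frame)            # 0 = background, 255 = foreground
    fg_mask = cv2.medianBlur(fg_mask, 5)  # suppress isolated misclassifications
cap.release()
```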
Citations: 51
A framework for automated measurement of the intensity of non-posed Facial Action Units
M. Mahoor, S. Cadavid, D. Messinger, J. Cohn
This paper presents a framework to automatically measure the intensity of naturally occurring facial actions. Naturalistic expressions are non-posed, spontaneous actions. The Facial Action Coding System (FACS) is the gold-standard technique for describing facial expressions, which are parsed into comprehensive, non-overlapping action units (AUs). AUs have intensities ranging from absent to maximal on a six-point metric (i.e., 0 to 5). Despite the efforts in recognizing the presence of non-posed action units, measuring their intensity has not been studied comprehensively. In this paper, we develop a framework to measure the intensity of AU12 (lip corner puller) and AU6 (cheek raising) in videos captured from live face-to-face infant-mother communication. AU12 and AU6 are the most challenging cases among infant expressions (e.g., due to the low facial texture of an infant's face). One of the problems in facial image analysis is the large dimensionality of the visual data. Our approach to this problem is to use the spectral regression technique to project high-dimensional facial images into a low-dimensional space. The facial images represented in the low-dimensional space are used to train support vector machine classifiers that predict the intensity of the action units. Analysis of 18 minutes of captured video of non-posed facial expressions of several infants and mothers shows significant agreement between a human FACS coder and our approach, making it an efficient approach for automated measurement of the intensity of non-posed facial action units.
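A minimal sketch of this measurement pipeline follows: reduce high-dimensional face images to a low-dimensional space, then train an SVM to predict the 0-5 AU intensity. PCA stands in here for the spectral-regression projection the paper uses, and `X` (vectorized frames) and `y` (intensity labels) are assumed to come from FACS-coded video.

```python
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def au_intensity_model(n_components=50):
    """Dimensionality reduction followed by a multi-class SVM over
    the six intensity levels (0-5)."""
    return make_pipeline(PCA(n_components=n_components),
                         SVC(kernel="rbf", C=1.0))

# X: (n_frames, n_pixels) vectorized face images; y: FACS-coded 0-5
# intensity labels. Both are assumed to exist.
# scores = cross_val_score(au_intensity_model(), X, y, cv=5)
```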
Citations: 122
Inference and learning with hierarchical compositional models
Iasonas Kokkinos, A. Yuille
Summary form only given. In this work we consider the problem of object parsing, namely detecting an object and its components by composing them from image observations. We address the computational complexity of the inference problem by exploiting our hierarchical object representation to efficiently compute a coarse solution, which we then use to guide search at a finer level. Starting from our adaptation of the A* parsing algorithm to object parsing, we then propose a coarse-to-fine approach capable of detecting multiple objects simultaneously. We extend this work to automatically learn a hierarchical model for a category from a set of training images for which only the bounding box is available. Our approach consists in (a) automatically registering a set of training images and constructing an object template, (b) recovering object contours, (c) finding object parts based on contour affinities, and (d) discriminatively learning a parsing cost function.
Citations: 2
An affine invariant hyperspectral texture descriptor based upon heavy-tailed distributions and Fourier analysis
P. Khuwuthyakorn, A. Robles-Kelly, J. Zhou
In this paper, we address the problem of recovering a hyperspectral texture descriptor. We do this by viewing the wavelength-indexed bands corresponding to the texture in the image as arising from a stochastic process whose statistics can be captured by making use of the relationships between moment generating functions and Fourier kernels. In this manner, we can interpret the probability distribution of the hyperspectral texture as a heavy-tailed one, which can be rendered invariant to affine geometric transformations on the texture plane by making use of the spectral power of its Fourier cosine transform. We do this by recovering the affine geometric distortion matrices corresponding to the probability density function for the texture under study. This treatment permits the development of a robust descriptor that has a high information-compaction property and can capture the spatial and wavelength correlations of the spectra in hyperspectral images. We illustrate the utility of our descriptor for recognition purposes and provide results on real-world datasets. We also compare our results to those yielded by a number of alternatives.
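A minimal sketch of the descriptor's core step, under stated assumptions: take the cosine transform of a wavelength-indexed texture cube (rows x cols x bands) and keep the low-frequency spectral power as a compact feature vector. The affine-distortion estimation that makes the paper's descriptor invariant is omitted here.

```python
import numpy as np
from scipy.fft import dctn

def hyperspectral_descriptor(cube, keep=8):
    """cube: (rows, cols, bands) hyperspectral texture patch."""
    coeffs = dctn(cube, norm="ortho")          # 3D cosine transform
    power = coeffs[:keep, :keep, :keep] ** 2   # low-frequency spectral power
    return power.ravel() / power.sum()         # normalized feature vector

patch = np.random.rand(32, 32, 16)             # synthetic stand-in data
print(hyperspectral_descriptor(patch).shape)   # -> (512,)
```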
Citations: 4