
Latest Publications: Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580)

Classifying facial attributes using a 2-D Gabor wavelet representation and discriminant analysis
Michael J. Lyons, Julien Budynek, A. Plante, S. Akamatsu
A method for automatically classifying facial images is proposed. Faces are represented using elastic graphs labelled with 2D Gabor wavelet features. The system is trained from examples to classify faces on the basis of high-level attributes, such as sex, "race", and expression, using linear discriminant analysis (LDA). Use of the Gabor representation relaxes the requirement for precise normalization of the face: approximate registration of a facial graph is sufficient. LDA allows simple and rapid training from examples, as well as straightforward interpretation of the role of the input features for classification. The algorithm is tested on three different facial image datasets, one of which was acquired under relatively uncontrolled conditions, on tasks of sex, "race" and expression classification. Results of these tests are presented. The discriminant vectors may be interpreted in terms of the saliency of the input features for the different classification tasks, which we portray visually with feature saliency maps for node position as well as filter spatial frequency and orientation.
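The LDA stage described above admits a compact sketch. The following is a minimal two-class Fisher discriminant in numpy, run on synthetic stand-ins for Gabor-jet feature vectors; the data, dimensionality, and midpoint threshold rule are illustrative, not the paper's actual setup.

```python
import numpy as np

# Synthetic stand-ins for Gabor-jet feature vectors of two classes
# (e.g. the sex-classification task); sizes are toy values.
rng = np.random.default_rng(0)
X_a = rng.normal(loc=0.0, scale=1.0, size=(100, 5))
X_b = rng.normal(loc=2.0, scale=1.0, size=(100, 5))

def fisher_lda(X1, X2):
    """Two-class Fisher discriminant: w = Sw^-1 (m1 - m2)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    Sw = np.cov(X1, rowvar=False) + np.cov(X2, rowvar=False)  # within-class scatter
    w = np.linalg.solve(Sw, m1 - m2)
    midpoint = 0.5 * ((X1 @ w).mean() + (X2 @ w).mean())      # decision threshold
    return w, midpoint

w, t = fisher_lda(X_a, X_b)
correct = ((X_a @ w) > t).sum() + ((X_b @ w) <= t).sum()
accuracy = correct / 200

# The magnitudes |w_i| play the role of the feature-saliency values
# the abstract visualizes as saliency maps.
saliency = np.abs(w)
```

The interpretability the abstract mentions comes essentially for free: classification is a single linear projection, so the weight magnitudes directly rank the input features.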
DOI: 10.1109/AFGR.2000.840635 · Published 2000-03-28
Citations: 176
Facial tracking and animation using a 3D sensor
T. Fromherz, B. Takács, E. Hueso, Dimitris N. Metaxas, P. Stucki
Summary form only given. We describe a high-performance face tracking and animation solution used for low-cost production of TV commercials, films and special effects. The system combines a 3D sensor that captures full facial surfaces at arbitrary frame rates with our advanced facial tracking technology to produce highly accurate animation without the animator's intervention. We briefly review the state-of-the-art in facial tracking and animation and present experimental results proving the effectiveness of our method.
DOI: 10.1109/AFGR.2000.840628 · Published 2000-03-28
Citations: 0
Video annotation for content-based retrieval using human behavior analysis and domain knowledge
H. Miyamori, S. Iisaku
This paper proposes the automatic annotation of sports video for content-based retrieval. Conventional methods that use the position information of objects (locus, relative positions, their transitions, etc.) as indices have two drawbacks: tracking errors caused by occlusion of an object lead to recognition failures, and representation by position information supports only a limited number of recognizable events in retrieval. Our approach incorporates human behavior analysis and specific domain knowledge into conventional methods to develop an integrated reasoning module for richer expressiveness of events and robust recognition. Based on the proposed method, we implemented a content-based retrieval system which can identify several actions in real tennis video. We select court and net lines, players' positions, ball positions, and players' actions as indices. Court and net lines are extracted using a court model and Hough transforms. Player and ball positions are tracked by adaptive template matching and particular predictions against sudden changes of motion direction. Players' actions are analyzed by 2D appearance-based matching using the transition of players' silhouettes and a hidden Markov model. Results on two sets of tennis video are presented, demonstrating the performance and validity of our approach.
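The court/net-line extraction step relies on the Hough transform. A tiny numpy version of the voting scheme, run on a synthetic 50x50 edge image containing a single vertical line at x = 20 (the image and resolution are illustrative, not the paper's):

```python
import numpy as np

# Synthetic edge image: one vertical line at column x = 20.
img = np.zeros((50, 50), dtype=bool)
img[:, 20] = True

ys, xs = np.nonzero(img)
thetas = np.deg2rad(np.arange(180))                 # candidate line orientations
diag = int(np.ceil(np.hypot(*img.shape)))           # max possible |rho|
acc = np.zeros((2 * diag, len(thetas)), dtype=int)  # (rho, theta) accumulator

# Each edge pixel votes along its sinusoid rho = x cos(theta) + y sin(theta).
for x, y in zip(xs, ys):
    rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
    acc[rhos + diag, np.arange(len(thetas))] += 1

rho_idx, theta_idx = np.unravel_index(acc.argmax(), acc.shape)
rho = rho_idx - diag
theta_deg = np.rad2deg(thetas[theta_idx])
# The strongest accumulator peak recovers the line: theta_deg = 0, rho = 20.
```

All 50 pixels of the line vote into the same (rho, theta) cell, so the peak stands well clear of the spread-out votes at other orientations, which is what makes the method robust to the clutter of a real court image.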
DOI: 10.1109/AFGR.2000.840653 · Published 2000-03-26
Citations: 121
Towards model-based gesture recognition
G. Schmidt, D. House
We propose a new technique for gesture recognition that involves both physical and control models of gesture performance, and describe preliminary experiments done to validate the approach. The technique incorporates underlying dynamics and control models to augment a set of Kalman-filter-based recognizer modules, so that each filters the input data under the a priori assumption that one particular gesture is being performed. The recognized gesture is the one whose filter output most closely matches the output of an unaugmented Kalman filter. In our preliminary experiments, we treated gestures made with simple motions of the right arm, tracking only hand position. We modeled the path that the hand traverses while performing a gesture as a point mass moving through air. The control model for each specific gesture was simply an experimentally determined sequence of applied forces plus a proportional control based on spatial position. Our experiments showed that even with such a simple set of models we were able to obtain results reasonably comparable with a carefully hand-constructed feature-based discriminator on a limited set of spatially distinct planar gestures.
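The filter-bank idea above can be sketched in a few lines: one Kalman filter per gesture, each assuming that gesture's control (force) sequence, with the recognized gesture being the model whose filter best explains the observations. The 1-D point mass, the two force sequences, the gesture names, and the noise settings below are all illustrative, not the paper's.

```python
import numpy as np

F = np.array([[1.0, 1.0], [0.0, 1.0]])    # position/velocity dynamics, dt = 1
B = np.array([0.5, 1.0])                  # force enters via Newtonian kinematics
H = np.array([1.0, 0.0])                  # only position is observed

def filter_cost(zs, forces, q=0.01, r=0.01):
    """Run a Kalman filter under one gesture's assumed forces; return the
    summed normalized squared innovation (lower = better explained)."""
    x, P, cost = np.zeros(2), np.eye(2), 0.0
    for z, u in zip(zs, forces):
        x = F @ x + B * u                 # predict with this model's force
        P = F @ P @ F.T + q * np.eye(2)
        innov = z - H @ x                 # innovation against the observation
        S = H @ P @ H + r
        K = P @ H / S
        x = x + K * innov
        P = (np.eye(2) - np.outer(K, H)) @ P
        cost += innov ** 2 / S
    return cost

push = np.array([1.0] * 5 + [-1.0] * 5)   # hypothetical "push": thrust, then brake
wave = np.array([1.0, -1.0] * 5)          # hypothetical "wave": oscillate

# Simulate noisy position observations of an actual "push" gesture.
rng = np.random.default_rng(1)
x, zs = np.zeros(2), []
for u in push:
    x = F @ x + B * u
    zs.append(x[0] + rng.normal(scale=0.1))

costs = {name: f_cost for name, f_cost in
         ((n, filter_cost(zs, f)) for n, f in [("push", push), ("wave", wave)])}
recognized = min(costs, key=costs.get)
```

The filter running under the correct force model sees only observation-noise-sized innovations, while the mismatched model's predictions drift away from the data, so its accumulated innovation cost grows quickly.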
DOI: 10.1109/AFGR.2000.840668 · Published 2000-03-26
Citations: 13
Pose invariant face recognition
Fu Jie Huang, Tsuhan Chen, Zhi-Hua Zhou, HongJiang Zhang
We describe a novel neural network architecture, which can recognize human faces at any view within a certain viewing angle range (from 30 degrees left to 30 degrees right of out-of-plane rotation). View-specific eigenface analysis is used as the front-end of the system to extract features, and a neural network ensemble is used for recognition. Experimental results show that the recognition accuracy of our network ensemble is higher than conventional methods such as using a single neural network to recognize faces of a specific view.
DOI: 10.1109/AFGR.2000.840642 · Published 2000-03-26
Citations: 156
Face detection using mixtures of linear subspaces
Ming-Hsuan Yang, N. Ahuja, D. Kriegman
We present two methods using mixtures of linear sub-spaces for face detection in gray level images. One method uses a mixture of factor analyzers to concurrently perform clustering and, within each cluster, perform local dimensionality reduction. The parameters of the mixture model are estimated using an EM algorithm. A face is detected if the probability of an input sample is above a predefined threshold. The other mixture of subspaces method uses Kohonen's self-organizing map for clustering and Fisher linear discriminant to find the optimal projection for pattern classification, and a Gaussian distribution to model the class-conditioned density function of the projected samples for each class. The parameters of the class-conditioned density functions are maximum likelihood estimates and the decision rule is also based on maximum likelihood. A wide range of face images including ones in different poses, with different expressions and under different lighting conditions are used as the training set to capture the variations of human faces. Our methods have been tested on three sets of 225 images which contain 871 faces. Experimental results on the first two datasets show that our methods perform as well as the best methods in the literature, yet have fewer false detects.
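The decision stage of the second method (Gaussian class-conditioned densities over the projected samples, with maximum-likelihood parameters and decision rule) can be sketched directly. The 1-D projected samples and class placements below are synthetic, standing in for the outputs of the Fisher projection:

```python
import numpy as np

rng = np.random.default_rng(3)
face_proj = rng.normal(loc=1.0, scale=0.5, size=500)      # projected faces
nonface_proj = rng.normal(loc=-1.0, scale=0.8, size=500)  # projected non-faces

def gaussian_ml(samples):
    """Return a log-density function under ML-fitted Gaussian parameters."""
    mu, var = samples.mean(), samples.var()   # ML estimates of mean/variance
    return lambda x: -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

log_face = gaussian_ml(face_proj)
log_nonface = gaussian_ml(nonface_proj)

def classify(x):
    # Maximum-likelihood decision rule over the two class models.
    return "face" if log_face(x) > log_nonface(x) else "non-face"
```

The first method's thresholded-probability detection works the same way with a single model: declare a face wherever the mixture's likelihood of the window exceeds a preset threshold.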
DOI: 10.1109/AFGR.2000.840614 · Published 2000-03-26
Citations: 120
Robust facial feature localization by coupled features
M. Zobel, A. Gebhard, D. Paulus, Joachim Denzler, H. Niemann
We consider the problem of robust localization of faces and some of their facial features. The task arises, e.g., in the medical field of visual analysis of facial paresis. We detect faces and facial features by means of appropriate DCT coefficients that we obtain by neatly using the coding capabilities of a JPEG hardware compressor. Besides an anthropometric localization approach, we focus on how spatial coupling of the facial features can be used to improve robustness of the localization. Because the presented approach is embedded in a completely probabilistic framework, it is not restricted to facial features; it can be generalized to multipart objects of any kind. Therefore the notion of a "coupled structure" is introduced. Finally, the approach is applied to the problem of localizing facial features in DCT-coded images, and results from our experiments are shown.
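The DCT coefficients in question come from JPEG's 8x8 block transform. The 2-D DCT-II of a block can be written as C @ block @ C.T with an orthonormal basis matrix C; a numpy sketch on a synthetic ramp block (the image content is illustrative):

```python
import numpy as np

N = 8
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
C[0] /= np.sqrt(2.0)                       # DC-row scaling makes C orthonormal

block = np.tile(np.arange(N, dtype=float), (N, 1))  # horizontal intensity ramp
coeffs = C @ block @ C.T                   # forward 2-D DCT-II
recon = C.T @ coeffs @ C                   # inverse recovers the block exactly

# Rows 1..7 of coeffs vanish: the block is constant along the vertical axis,
# so all energy sits in the first row of horizontal frequencies.
```

This energy compaction is what makes a handful of low-order coefficients per block usable as localization features, and a hardware JPEG encoder computes them at no extra cost.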
DOI: 10.1109/AFGR.2000.840604 · Published 2000-03-26
Citations: 40
Detection and estimation of pointing gestures in dense disparity maps
N. Jojic, Thomas S. Huang, B. Brumitt, B. Meyers, Steve Harris
We describe a real-time system for detecting pointing gestures and estimating the direction of pointing using stereo cameras. Previously, similar systems were implemented using color-based blob trackers, which relied on effective skin color detection; this approach is sensitive to lighting changes and the clothing worn by the user. In contrast, we used a stereo system that produces dense disparity maps in real-time. Disparity maps are considerably less sensitive to lighting changes. Our system subtracts the background, analyzes the foreground pixels to break the body into parts using a robust mixture model, and estimates the direction of pointing. We have tested the system on both coarse and fine pointing by selecting the targets in a room and controlling the cursor on a wall screen, respectively.
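One simple way to realize a direction estimate from segmented body parts is to fit the principal axis of the arm's 3-D points by SVD (a least-squares line fit). This is a sketch of the idea, not the paper's estimator; the geometry and noise level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
true_dir = np.array([1.0, 0.2, -0.1])
true_dir /= np.linalg.norm(true_dir)
t = rng.uniform(0.0, 0.6, size=200)                   # samples along the forearm
points = np.array([0.0, 1.4, 0.5]) + np.outer(t, true_dir)
points += rng.normal(scale=0.005, size=points.shape)  # stereo depth noise

centered = points - points.mean(axis=0)
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
direction = Vt[0]                                     # first principal axis
if direction @ true_dir < 0:                          # resolve the sign ambiguity
    direction = -direction                            # (in practice: hand vs. elbow end)

cos_err = direction @ true_dir                        # close to 1 for a good fit
```

Because dense disparity gives metric 3-D coordinates for every foreground pixel, the fit averages over hundreds of points and stays stable under the lighting changes that defeat color-based trackers.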
DOI: 10.1109/AFGR.2000.840676 · Published 2000-03-26
Citations: 146
Determining correspondences for statistical models of facial appearance
K. N. Walker, Tim Cootes, C. Taylor
In order to build a statistical model of facial appearance we require a set of images, each with a consistent set of landmarks. We address the problem of automatically placing a set of landmarks to define the correspondences across an image set. We can estimate correspondences between any pair of images by locating salient points on one and finding their corresponding position in the second. However, we wish to determine a globally consistent set of correspondences across all the images. We present an iterative scheme in which these pairwise correspondences are used to determine a global correspondence across the entire set. We show results on several training sets, and demonstrate that an appearance model trained on the correspondences is of higher quality than one built from hand-marked images.
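A 1-D toy of the pairwise-to-global reconciliation idea: noisy pairwise offsets s_ij between landmark locations are made globally consistent by least squares, enforcing s_ij ≈ p_j - p_i for one shared set of positions p. The setup is illustrative only and much simpler than the paper's iterative scheme.

```python
import numpy as np

rng = np.random.default_rng(5)
true_pos = np.array([0.0, 1.0, 2.5, 4.0])           # landmark position per image
n = len(true_pos)

rows, meas = [], []
for i in range(n):
    for j in range(i + 1, n):
        r = np.zeros(n)
        r[i], r[j] = -1.0, 1.0                      # row encodes p_j - p_i
        rows.append(r)
        meas.append(true_pos[j] - true_pos[i] + rng.normal(scale=0.1))

A, b = np.array(rows), np.array(meas)
A = np.vstack([A, np.eye(n)[0]])                    # anchor p_0 = 0 (gauge fix)
b = np.append(b, 0.0)
pos, *_ = np.linalg.lstsq(A, b, rcond=None)
err = np.abs(pos - true_pos).max()                  # small after reconciliation
```

Each position is constrained by several pairwise measurements, so independent pairwise errors partially cancel; this is the sense in which a global solution is of higher quality than any single pairwise match.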
DOI: 10.1109/AFGR.2000.840646 · Published 2000-03-26
Citations: 4
A probabilistic framework for rigid and non-rigid appearance based tracking and recognition
F. D. L. Torre, Y. Yacoob, L. Davis
This paper describes a unified probabilistic framework for appearance-based tracking of rigid and non-rigid objects. A spatio-temporally dependent shape-texture eigenspace and a mixture of diagonal Gaussians are learned in a hidden Markov model (HMM)-like structure to better constrain the model and for recognition purposes. Particle filtering is used to track the object while switching between different shape/texture models. This framework allows recognition and temporal segmentation of activities. Additionally, an automatic stochastic initialization is proposed; the number of states in the HMM is selected based on the Akaike information criterion; and a comparison with deterministic tracking for 2D models is discussed. Preliminary results of eye tracking, lip tracking and temporal segmentation of mouth events are presented.
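A minimal particle-filter sketch of the tracking step described above: particles carry a 1-D state, propagate under a drift-plus-noise motion model, and are reweighted by a Gaussian observation likelihood. All settings here are toy values, far simpler than the paper's shape-texture state.

```python
import numpy as np

rng = np.random.default_rng(6)
n_particles = 500
particles = rng.normal(0.0, 1.0, size=n_particles)
true_state = 0.0

for step in range(30):
    true_state += 0.1                               # object drifts right
    z = true_state + rng.normal(scale=0.2)          # noisy observation
    particles += 0.1 + rng.normal(scale=0.1, size=n_particles)  # propagate
    w = np.exp(-0.5 * ((z - particles) / 0.2) ** 2) # observation likelihood
    w /= w.sum()
    cdf = np.cumsum(w)
    cdf[-1] = 1.0                                   # guard against round-off
    u = (rng.random() + np.arange(n_particles)) / n_particles
    particles = particles[np.searchsorted(cdf, u)]  # systematic resampling

estimate = particles.mean()                         # posterior mean tracks the state
```

Model switching, as in the paper, amounts to adding a discrete model index to each particle's state and letting the motion model include transitions between indices.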
DOI: 10.1109/AFGR.2000.840679 · Published 2000-03-26
Citations: 48