
Proceedings IEEE ICCV Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems: Latest Publications

Fast hand gesture recognition for real-time teleconferencing applications
James Maclean
Work on real-time hand-gesture recognition for SAVI (stereo active vision interface) is presented. Based on the detection of frontal faces, image regions near the face are searched for skin-tone blobs. Each blob is evaluated to determine whether it is a hand held in a standard pose. A verification algorithm based on the responses of elongated oriented filters is used to decide whether a hand is present. Once a hand is detected, gestures are given by varying the number of fingers visible. The hand is segmented using an algorithm which detects connected skin-tone blobs in the region of interest, and a medial axis transform (skeletonization) is applied. Analysis of the resulting skeleton allows detection of the number of fingers visible, thus determining the gesture. The skeletonization is sensitive to strong shadows, which may alter the detected morphology of the hand. Experimental results are given indicating good performance of the algorithm.
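The finger-counting step can be pictured with a short sketch: skeletonize the segmented hand mask and count skeleton endpoints. This is a minimal illustration using scikit-image and SciPy, not the paper's implementation; the `count_fingers` helper and the wrist-endpoint heuristic are assumptions for illustration.

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.morphology import skeletonize

def count_fingers(hand_mask: np.ndarray) -> int:
    """Estimate visible fingers from a binary hand mask (illustrative)."""
    skeleton = skeletonize(hand_mask)
    # A skeleton endpoint has exactly one 8-connected skeleton neighbour.
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])
    neighbours = convolve(skeleton.astype(np.uint8), kernel, mode="constant")
    endpoints = skeleton & (neighbours == 1)
    # Assume one endpoint belongs to the wrist/arm; the rest are fingertips.
    return max(int(endpoints.sum()) - 1, 0)
```

As the abstract notes, strong shadows can distort the segmented mask, and the endpoint count is only as reliable as the skeleton it is read from.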
Citations: 58
The Hand Mouse: GMM hand-color classification and mean shift tracking
T. Kurata, T. Okuma, M. Kourogi, K. Sakaue
This paper describes an algorithm to detect and track a hand in each image taken by a wearable camera. We primarily use color information; however, instead of pre-defined skin-color models, we dynamically construct hand- and background-color models by using a Gaussian mixture model (GMM) to approximate the color histogram. We use a spatial probability distribution of hand pixels both to obtain the estimated mean of hand color required by the restricted EM algorithm that estimates the GMM and to classify hand pixels based on Bayes decision theory. Since a static distribution is inadequate for the hand-tracking stage, we translate the distribution with the hand motion based on the mean shift algorithm. Using the proposed method, we implemented the Hand Mouse, which uses the wearer's hand as a pointing device, on our wearable vision system.
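A minimal sketch of the pixel-classification idea follows, with scikit-learn's `GaussianMixture` standing in for the paper's restricted EM and a per-pixel prior standing in for the spatial probability distribution; the function names and the mean-shift tracking stage are omitted or assumed.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_color_models(hand_pixels, background_pixels, components=3):
    # Approximate each colour histogram with a Gaussian mixture.
    hand_gmm = GaussianMixture(n_components=components).fit(hand_pixels)
    bg_gmm = GaussianMixture(n_components=components).fit(background_pixels)
    return hand_gmm, bg_gmm

def classify_hand_pixels(image, hand_gmm, bg_gmm, prior_hand):
    # Bayes decision per pixel: label "hand" iff
    # p(x|hand) P(hand) > p(x|background) P(background).
    # prior_hand may be a scalar or a flattened spatial prior map.
    x = image.reshape(-1, 3).astype(float)
    log_hand = hand_gmm.score_samples(x) + np.log(prior_hand)
    log_bg = bg_gmm.score_samples(x) + np.log(1.0 - prior_hand)
    return (log_hand > log_bg).reshape(image.shape[:2])
```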
Citations: 82
Auto clustering for unsupervised learning of atomic gesture components using minimum description length
M. Walter, A. Psarrou, S. Gong
We present an approach to automatically segment and label a continuous observation sequence of hand gestures for fully unsupervised model acquisition. The method is based on the assumption that gestures can be viewed as repetitive sequences of atomic components, similar to phonemes in speech, governed by a high-level structure controlling the temporal sequence. We show that the generating process for the atomic components can be described in gesture space by a mixture of Gaussians, with each mixture component tied to one atomic behaviour. Mixture components are determined using a standard expectation-maximisation approach, while the number of components is determined by an information criterion, the minimum description length.
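The model-selection step can be sketched as below. Here scikit-learn's BIC score stands in for the minimum description length (the two coincide up to constants under common coding assumptions); using BIC is an assumption of this sketch, not the authors' exact criterion.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def select_gesture_components(samples, max_components=10):
    # Fit mixtures of increasing size and keep the one whose
    # description length (approximated here by BIC) is smallest.
    best_model, best_score = None, np.inf
    for k in range(1, max_components + 1):
        gmm = GaussianMixture(n_components=k, random_state=0).fit(samples)
        score = gmm.bic(samples)
        if score < best_score:
            best_model, best_score = gmm, score
    return best_model
```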
Citations: 10
A vision-based microphone switch for speech intent detection
G. Iyengar, C. Neti
We present our system for speech intent detection. In traditional desktop speech applications, the user has to explicitly indicate intent-to-speak to the computer by turning the microphone on. This is to alleviate problems associated with an open microphone in an automatic speech recognition system. In this paper, we use cues derived from user pose, proximity and visual speech activity to detect speech intent and enable automatic control of the microphone. We achieve real-time performance using pre-attentive cues to eliminate redundant computation.
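The gating logic can be pictured as a cheap cascade over the three cues the abstract names. This sketch, its helper name, and all thresholds (`max_distance_m`, `motion_threshold`) are hypothetical placeholders rather than values from the paper.

```python
def microphone_should_open(face_frontal: bool,
                           distance_m: float,
                           mouth_motion: float,
                           max_distance_m: float = 1.0,
                           motion_threshold: float = 0.2) -> bool:
    # Pre-attentive cascade: each cheap cue can veto before the
    # more expensive visual speech-activity check runs.
    if not face_frontal:             # pose: user not facing the machine
        return False
    if distance_m > max_distance_m:  # proximity: user too far away
        return False
    return mouth_motion > motion_threshold  # visual speech activity
```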
Citations: 12
Facial expression recognition using continuous dynamic programming
H. Zhang, Y. Guo
We describe an approach to facial expression recognition (FER). We represent facial expressions by a facial motion graph (FMG), which is based on feature points and muscle movements. FER is achieved by analyzing the similarity between an unknown expression's FMG and FMG models of known expressions using continuous dynamic programming. Furthermore, we propose a method to evaluate edge weights in the FMG similarity calculation, and use these edge weights to achieve a more accurate and robust system. Experiments show the excellent performance of this system on our video database, which contains video data captured under various conditions with multiple motion patterns.
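One way to picture the edge-weighted similarity is a weighted distance between per-frame motion vectors, which continuous dynamic programming then accumulates along the alignment path. The function below is a hypothetical sketch of that frame distance, not the authors' formulation.

```python
import numpy as np

def fmg_frame_distance(motion_a, motion_b, edge_weights):
    # Each component is the displacement of one facial-motion-graph
    # edge; edges that discriminate expressions well get larger weights.
    diff = np.asarray(motion_a, float) - np.asarray(motion_b, float)
    return float(np.sqrt(np.sum(edge_weights * diff ** 2)))
```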
Citations: 5
Boosting for fast face recognition
G. Guo, HongJiang Zhang
We propose to use the AdaBoost algorithm for face recognition. AdaBoost is a kind of large-margin classifier and is efficient for online learning. To adapt AdaBoost to fast face recognition, the original AdaBoost, which uses all given features, is compared with boosting over feature dimensions. The comparable results justify the use of the latter, which is faster for classification. AdaBoost is fundamentally a two-class classifier. To solve the multi-class recognition problem, we propose a constrained majority voting strategy that largely reduces the number of pairwise comparisons without losing recognition accuracy. Experimental results on a large face database of 1079 faces of 137 individuals show the feasibility of our approach for fast face recognition.
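A sketch of pairwise classification with an elimination-style vote follows. Reading "constrained majority voting" as a tournament that discards the loser of each comparison is our assumption, and scikit-learn's `AdaBoostClassifier` stands in for the paper's boosted feature dimensions.

```python
import numpy as np
from itertools import combinations
from sklearn.ensemble import AdaBoostClassifier

def train_pairwise(features, labels):
    # One two-class AdaBoost model per pair of identities.
    # features: (N, D) array; labels: (N,) array of class ids.
    models = {}
    for a, b in combinations(sorted(set(labels)), 2):
        mask = np.isin(labels, [a, b])
        models[(a, b)] = AdaBoostClassifier().fit(features[mask], labels[mask])
    return models

def classify(x, models, classes):
    # Eliminate the loser of each pairwise comparison, so only
    # n-1 comparisons are needed instead of all n(n-1)/2.
    remaining = list(classes)
    current = remaining.pop()
    while remaining:
        challenger = remaining.pop()
        pair = tuple(sorted((current, challenger)))
        current = models[pair].predict(x.reshape(1, -1))[0]
    return current
```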
Citations: 56
An integrated approach to 3D face model reconstruction from video
Chia-Ming Cheng, S. Lai
We present an integrated system for reconstructing an individualized 3D head model from a video sequence. Our reconstruction algorithm is based on the adaptation of a generic 3D head model. 3D geometric constraints on the head model are computed from a robust bundle adjustment algorithm and a structure-from-silhouette method. These 3D constraints are integrated to adapt the generic head model via radial basis function interpolation. The texture map of the reconstructed 3D head model is then obtained by blending all the images in the sequence with appropriate weighting. The proposed face model reconstruction method has the advantages of efficient computation and robustness against noise and outliers.
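The adaptation step can be sketched with SciPy's `RBFInterpolator`: learn a smooth displacement field from the 3D constraint points and apply it to every vertex of the generic head. This is a minimal sketch under assumed data layouts, not the paper's pipeline.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def adapt_generic_head(generic_vertices, control_src, control_dst):
    # control_src: (M, 3) control points on the generic model;
    # control_dst: (M, 3) positions recovered from bundle adjustment
    # and structure from silhouette. Interpolate the displacements
    # smoothly over the whole (V, 3) vertex set.
    field = RBFInterpolator(control_src, control_dst - control_src)
    return generic_vertices + field(generic_vertices)
```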
Citations: 27
Eyes 'n ears: face detection utilizing audio and video cues
B. Kapralos, Michael Jenkin, E. Milios, John K. Tsotsos
This work investigates the development of a robust and portable teleconferencing system utilizing both audio and video cues. An omnidirectional video sensor provides a view of the entire visual hemisphere, thereby providing multiple dynamic views of the participants. Regions of skin are detected using simple statistical methods, along with histogram color models for both skin and non-skin color classes. Skin regions belonging to the same person are grouped together. Using simple geometrical properties, the location of each person's face in the "real world" is estimated and provided to the audio system as a possible sound source direction. Beamforming and sound detection techniques with a small, compact microphone array allow the audio system to detect and attend to the speech of each participant, thereby reducing unwanted noise and sounds emanating from other locations. The results of experiments conducted in normal, reverberant environments indicate the effectiveness of both the audio and video systems.
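The audio side can be pictured with a delay-and-sum beamformer steered toward the face direction reported by the vision system. This sketch uses nearest-sample (rather than fractional) delays and assumes delays far shorter than the buffer; it is an illustration, not the paper's implementation.

```python
import numpy as np

def delay_and_sum(signals, mic_positions, direction, fs, c=343.0):
    # signals: (M, T) microphone waveforms; mic_positions: (M, 3) in
    # metres; direction: unit vector toward the detected face.
    # Sound arriving from `direction` adds coherently after alignment,
    # while sound from other locations is attenuated.
    delays = mic_positions @ direction / c   # geometric delay per mic
    delays -= delays.min()                   # make all delays non-negative
    out = np.zeros(signals.shape[1])
    for signal, delay in zip(signals, delays):
        shift = int(round(delay * fs))       # nearest-sample alignment
        out[:len(signal) - shift] += signal[shift:]
    return out / len(signals)
```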
Citations: 2
Hybrid face recognition systems for profile views using the MUGSHOT database
F. Wallhoff, Stefan Müller, G. Rigoll
Face recognition has established itself as an important sub-branch of pattern recognition within computer science. Many state-of-the-art systems have focused on recognizing frontal views or images with only slight variations in head pose and facial expression. We concentrate on two approaches to recognizing profile views (90 degrees) given prior knowledge of only the frontal view, a challenging task even for human beings. The first system makes use of synthesized profile views, and the second uses a joint parameter estimation technique. The systems we present combine artificial neural networks (NN) and a modeling technique based on hidden Markov models (HMM). One of the main ideas of these systems is to perform the recognition task without using any 3D information about heads and faces, such as a physical 3D model. Instead, we represent the rotation process by a NN trained with prior knowledge derived from image pairs showing the same person's frontal and profile view. Another important restriction of this task is that we use exactly one example frontal view to train the system to recognize the corresponding profile view of a previously unseen individual. The presented systems are tested with a subset of the MUGSHOT database.
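The view-transformation idea can be sketched as a regression network trained on frontal/profile feature pairs from other subjects; scikit-learn's `MLPRegressor`, the feature-vector representation, and the layer size are stand-ins for the paper's network and are assumptions of this sketch.

```python
from sklearn.neural_network import MLPRegressor

def train_view_mapper(frontal_features, profile_features):
    # Learn a frontal -> profile feature mapping from image pairs of
    # training subjects; at enrolment, the single frontal view of a
    # new person is mapped to a synthesized profile representation.
    net = MLPRegressor(hidden_layer_sizes=(128,), max_iter=2000)
    return net.fit(frontal_features, profile_features)

# Usage: synth_profile = net.predict(frontal_vector.reshape(1, -1))
# and match probe profiles against the synthesized representations.
```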
Citations: 9
Dynamic time warping for off-line recognition of a small gesture vocabulary
A. Corradini
We focus on visual sensory information to recognize human activity in the form of hand-arm movements from a small, predefined vocabulary. We accomplish this task by means of a matching technique that determines the distance between the unknown input and a set of previously defined templates. A dynamic time warping algorithm performs the time alignment and normalization by computing a temporal transformation that allows the two signals to be matched. The system is trained with finite video sequences of single gesture performances whose start and end points are accurately known. Preliminary experiments, conducted off-line, achieve a recognition accuracy of up to 92%.
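Dynamic time warping itself is standard and can be sketched directly. The nearest-template classifier below assumes one stored template per vocabulary entry, a simplification of the paper's setup.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    # Accumulated cost of the best monotonic alignment of two feature
    # sequences (the temporal transformation described in the text).
    n, m = len(seq_a), len(seq_b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(np.asarray(seq_a[i - 1], float)
                                  - np.asarray(seq_b[j - 1], float))
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1],
                                   acc[i - 1, j - 1])
    return acc[n, m]

def recognize(gesture, templates):
    # templates: {label: template_sequence}; return the closest label.
    return min(templates, key=lambda lbl: dtw_distance(gesture, templates[lbl]))
```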
Citations: 192