
Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580): Latest Publications

A fast and accurate face detector for indexation of face images
Raphaël Féraud, O. Bernier, J. Viallet, M. Collobert
Detecting faces in images with complex backgrounds is a difficult task. Our approach, which obtains state-of-the-art results, is based on a generative neural network model: the constrained generative model (CGM). To detect side-view faces and to decrease the number of false alarms, a conditional mixture of networks is used. To decrease the computational time cost, a fast search algorithm is proposed. The level of performance reached, in terms of detection accuracy and processing time, allows us to apply this detector to a real-world application: the indexation of face images on the WWW.
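The abstract only outlines the fast search algorithm. As a rough illustration in Python, a coarse-to-fine sliding-window scan with early rejection captures the general idea; the `score_fn` classifier, window size, step sizes, and threshold below are placeholders, not the paper's CGM:

```python
import numpy as np

def multiscale_scan(image, score_fn, window=20, coarse_step=8, fine_step=2, thresh=0.5):
    """Coarse-to-fine sliding-window scan: score a sparse grid first, then
    re-scan finely only around coarse hits. score_fn is assumed to map a
    window to a face-likelihood in [0, 1] (e.g. a CGM-style reconstruction
    score); it is a placeholder, not the paper's model."""
    H, W = image.shape
    detections = []
    for y in range(0, H - window, coarse_step):
        for x in range(0, W - window, coarse_step):
            if score_fn(image[y:y+window, x:x+window]) < thresh:
                continue  # cheap early rejection on the coarse grid
            # refine around the coarse hit
            for dy in range(-coarse_step, coarse_step + 1, fine_step):
                for dx in range(-coarse_step, coarse_step + 1, fine_step):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy <= H - window and 0 <= xx <= W - window:
                        s = score_fn(image[yy:yy+window, xx:xx+window])
                        if s >= thresh:
                            detections.append((yy, xx, s))
    return detections
```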
Citations: 72
Comparison of confidence measures for face recognition
S. Eickeler, Mirco Jabs, G. Rigoll
This paper compares different confidence measures for the results of statistical face recognition systems. The main applications of a confidence measure are the rejection of unknown people and the detection of recognition errors. Some of the confidence measures are based on the posterior probability and some on the ranking of the recognition results. The posterior probability is calculated by applying Bayes' rule, using different approximations of the unconditional likelihood. The confidence measure based on the ranking is a new method. Experiments to evaluate the confidence measures are carried out on a pseudo 2D hidden Markov model-based face recognition system and the Bochum face database.
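For concreteness, here is a minimal sketch of the two families of measures the abstract names: a posterior computed with Bayes' rule under two common approximations of the unconditional likelihood p(x), and a simple rank-style gap score. The function names and the exact rank measure are my choices, not necessarily the paper's:

```python
import numpy as np

def posterior_confidence(log_likelihoods, priors=None, mode="sum"):
    """Posterior-style confidence for the top-scoring class.
    log_likelihoods: array of log p(x | class_i) from the recognizer.
    The unconditional likelihood p(x) is approximated either by the sum
    over all class models ("sum") or by the best competing model ("max"),
    two typical approximations of the kind the paper compares."""
    ll = np.asarray(log_likelihoods, dtype=float)
    if priors is None:
        priors = np.full(ll.shape, 1.0 / ll.size)
    log_joint = ll + np.log(priors)
    best = int(np.argmax(log_joint))
    if mode == "sum":
        log_px = np.logaddexp.reduce(log_joint)   # p(x) = sum_j p(x|j) P(j)
    else:
        log_px = np.max(np.delete(log_joint, best))  # best competing class
    return np.exp(log_joint[best] - log_px), best

def rank_confidence(log_likelihoods):
    """A simple ranking-based measure: the gap between the best and the
    second-best score (the paper's rank-based measure may differ)."""
    top2 = np.sort(np.asarray(log_likelihoods, dtype=float))[-2:]
    return top2[1] - top2[0]
```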
Citations: 22
Real-time detection of nodding and head-shaking by directly detecting and tracking the "between-eyes"
S. Kawato, J. Ohya
Among head gestures, nodding and head-shaking are very common and frequently used, so the detection of such gestures is basic to a visual understanding of human responses. However, detecting them in real time is difficult, because nodding and head-shaking are fairly small and fast head movements. We propose an approach for detecting nodding and head-shaking in real time from a single color video stream by directly detecting and tracking a point between the eyes, or what we call the "between-eyes". Along a circle of a certain radius centered at the "between-eyes", the pixel values go through two cycles of bright parts (forehead and nose bridge) and dark parts (eyes and brows). The output of the proposed circle-frequency filter has a local maximum at these characteristic points. To distinguish the true "between-eyes" from similar characteristic points in other face parts, we confirm each candidate with eye detection. Once the "between-eyes" is detected, a small area around it is copied as a template and the system enters the tracking mode. Combined with the circle-frequency filtering, the template is used not to search the whole neighborhood but to select among candidates; the template is then updated. Thanks to this tracking algorithm, the system tracks the "between-eyes" stably and accurately, running at 13 frames/s without special hardware. By analyzing the movement of the point, we can detect nodding and head-shaking. Some experimental results are shown.
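The circle-frequency filter has a natural reading as the two-cycle Fourier component of the gray values sampled along the circle. A minimal sketch, with sampling density and normalization as illustrative choices rather than the paper's:

```python
import numpy as np

def circle_frequency(image, cx, cy, radius, n_samples=36):
    """Sample gray values on a circle around (cx, cy) and return the
    normalized magnitude of the second Fourier coefficient: two
    bright/dark cycles (forehead/nose-bridge vs. eyes/brows) give a
    strong response at the "between-eyes" point."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    xs = np.clip(np.round(cx + radius * np.cos(theta)).astype(int), 0, image.shape[1] - 1)
    ys = np.clip(np.round(cy + radius * np.sin(theta)).astype(int), 0, image.shape[0] - 1)
    values = image[ys, xs].astype(float)
    spectrum = np.fft.rfft(values - values.mean())
    return np.abs(spectrum[2]) / n_samples  # energy of the 2-cycle component
```

Scanning this response over the image and keeping local maxima yields "between-eyes" candidates, which the paper then confirms with eye detection.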
Citations: 139
Understanding purposeful human motion
C. Wren, B. Clarkson, A. Pentland
Human motion can be understood on many levels. The most basic level is the notion that humans are collections of things that have predictable visual appearance. Next is the notion that humans exist in a physical universe; as a consequence, a large part of human motion can be modeled and predicted with the laws of physics. Finally, there is the notion that humans utilize muscles to actively shape purposeful motion. We employ a recursive framework for real-time, 3D tracking of human motion that enables pixel-level, probabilistic processes to take advantage of the contextual knowledge encoded in the higher-level models, including models of dynamic constraints on human motion. We show that models of purposeful action arise naturally from this framework and, further, that those models can be used to improve the perception of human motion. Results are shown that demonstrate automatic discovery of features in this new feature space.
Citations: 91
A framework for modeling the appearance of 3D articulated figures
H. Kjellström, F. D. L. Torre, Michael J. Black
This paper describes a framework for constructing a linear subspace model of image appearance for complex articulated 3D figures such as humans and other animals. A commercial motion capture system provides 3D data that is aligned with images of subjects performing various activities. Portions of a limb's image appearance are seen from multiple views and for multiple subjects. From these partial views, weighted principal component analysis is used to construct a linear subspace representation of the "unwrapped" image appearance of each limb. The linear subspaces provide a generative model of the object appearance that is exploited in a Bayesian particle filtering tracking system. Results of tracking single limbs and walking humans are presented.
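As a sketch of the weighted PCA step under stated assumptions (per-pixel visibility weights in [0, 1], and a closed-form estimator rather than whatever iterative scheme the authors may actually use):

```python
import numpy as np

def weighted_pca(X, w, n_components=10):
    """Weighted PCA over partial views: X is (n_samples, n_pixels) of
    unwrapped limb appearance, w is (n_samples, n_pixels) with 0 where a
    pixel was occluded or unseen. The mean and covariance are weighted by
    visibility so that missing pixels do not corrupt the subspace."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    mean = (w * X).sum(axis=0) / np.maximum(w.sum(axis=0), 1e-9)
    Xc = (X - mean) * np.sqrt(w)                 # down-weight unreliable pixels
    # principal directions of the weighted data via SVD
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return mean, Vt[:n_components], s[:n_components]
```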
Citations: 83
Hallucinating faces
Simon Baker, T. Kanade
Faces often appear very small in surveillance imagery because of the wide fields of view that are typically used and the relatively large distance between the cameras and the scene. For tasks such as face recognition, resolution enhancement techniques are therefore generally needed. Although numerous resolution enhancement algorithms have been proposed in the literature, most of them are limited by the fact that they make weak, if any, assumptions about the scene. We propose an algorithm to learn a prior on the spatial distribution of the image gradient for frontal images of faces. We proceed to show how such a prior can be incorporated into a resolution enhancement algorithm to yield 4- to 8-fold improvements in resolution (i.e., 16 to 64 times as many pixels). The additional pixels are, in effect, hallucinated.
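In generic MAP super-resolution notation (H the unknown high-resolution face, L the observed low-resolution image, D the blur-and-downsample operator; my notation, not the paper's), the idea of incorporating a learned gradient prior can be written as:

```latex
\hat{H} \;=\; \arg\max_{H}\; p(L \mid H)\, p(\nabla H)
        \;=\; \arg\min_{H}\; \frac{\lVert D H - L \rVert^{2}}{2\sigma^{2}} \;-\; \log p(\nabla H)
```

The face-specific prior p(∇H), learned from frontal face images, is what licenses the aggressive 4- to 8-fold magnification: the recovered detail is drawn from the prior rather than the measurements, hence "hallucinated".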
Citations: 616
Person tracking in real-world scenarios using statistical methods
G. Rigoll, S. Eickeler, Stefan Müller
This paper presents a novel approach to robust and flexible person tracking using an algorithm that combines two powerful stochastic modeling techniques: a pseudo-2D hidden Markov model (P2DHMM), which captures the shape of a person within an image frame, and the well-known Kalman filtering algorithm, which uses the output of the P2DHMM to track the person by estimating a bounding-box trajectory indicating the person's location throughout the video sequence. The two algorithms cooperate in an optimal way, and this cooperative feedback makes it possible to track people even in the presence of background motion caused by moving objects or by camera operations such as panning or zooming. Our results are confirmed by several tracking examples in real scenarios, shown at the end of the paper and provided on the Web server of our institute.
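A minimal sketch of the Kalman side of the combination, assuming the P2DHMM supplies one measured bounding box (x, y, w, h) per frame and a constant-velocity motion model; the noise levels are illustrative, not the paper's tuning:

```python
import numpy as np

class BoxKalman:
    """Constant-velocity Kalman filter over a bounding-box state
    (x, y, w, h, vx, vy). The measurement z is assumed to be the box
    emitted by a P2DHMM-style shape model for the current frame."""
    def __init__(self, q=1e-2, r=1.0):
        self.x = np.zeros(6)                  # state estimate
        self.P = np.eye(6)                    # state covariance
        self.F = np.eye(6)                    # dynamics: x += vx, y += vy
        self.F[0, 4] = self.F[1, 5] = 1.0
        self.H = np.eye(4, 6)                 # we observe (x, y, w, h)
        self.Q = q * np.eye(6)                # process noise (illustrative)
        self.R = r * np.eye(4)                # measurement noise (illustrative)

    def step(self, z):
        # predict with the motion model
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # correct with the measured box z = (x, y, w, h)
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (np.asarray(z, dtype=float) - self.H @ self.x)
        self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:4]                     # smoothed box for this frame
```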
Citations: 47
Comprehensive database for facial expression analysis
T. Kanade, Ying-li Tian, J. Cohn
Within the past decade, significant effort has gone into developing methods of facial expression analysis. Because most investigators have used relatively limited data sets, the generalizability of these various methods remains unknown. We describe the problem space for facial expression analysis, which includes the level of description, transitions among expressions, eliciting conditions, reliability and validity of training and test data, individual differences among subjects, head orientation and scene complexity, image characteristics, and the relation to non-verbal behavior. We then present the CMU-Pittsburgh AU-Coded Face Expression Image Database, which currently includes 2105 digitized image sequences from 182 adult subjects of varying ethnicity performing multiple tokens of most primary FACS action units. This database is the most comprehensive testbed to date for comparative studies of facial expression analysis.
Citations: 2809
An incremental learning method for face recognition under continuous video stream
J. Weng, C. Evans, Wey-Shiuan Hwang
The current technology in computer vision requires humans to collect images, store images, segment images for computers and train computer recognition systems using these images. It is unlikely that such a manual labor process can meet the demands of many challenging recognition tasks. Our goal is to enable machines to learn directly from sensory input streams while interacting with the environment including human teachers. We propose a new technique which incrementally derives discriminating features in the input space. Virtual labels are formed by clustering in the output space to extract discriminating features in the input space. We organize the resulting discriminating subspace in a coarse-to-fine fashion and store the information in a decision tree. Such an incremental hierarchical discriminating regression (IHDR) decision tree can be modeled by a hierarchical probability distribution model. We demonstrate the performance of the algorithm on the problem of face recognition using video sequences of 33889 frames in length from 143 different subjects. A correct recognition rate of 95.1% has been achieved.
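As a toy, batch-mode illustration of the virtual-label idea (IHDR itself does this incrementally inside each tree node), k-means cluster ids in the output space can serve as labels for discriminant analysis in the input space:

```python
import numpy as np

def virtual_labels(Y, k, n_iter=20, seed=0):
    """Form virtual labels by k-means clustering in the output space Y
    (n_samples, output_dim); the returned cluster ids can then drive
    discriminating-feature extraction in the input space. A simplified
    batch stand-in for IHDR's incremental, per-node clustering."""
    rng = np.random.default_rng(seed)
    centers = Y[rng.choice(len(Y), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # assign each sample to its nearest output-space center
        labels = np.argmin(((Y[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels
```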
Citations: 55
A virtual 3D blackboard: 3D finger tracking using a single camera
Andrew Wu, M. Shah, N. Lobo
We present a method for tracking the 3D position of a finger, using a single camera placed several meters away from the user. After skin detection, we use motion to identify the gesticulating arm. The finger point is found by analyzing the arm's outline. To derive a 3D trajectory, we first track 2D positions of the user's elbow and shoulder. Given that a human's upper arm and lower arm have consistent length, we observe that the possible locations of a finger and elbow form two spheres with constant radii. From the previously tracked body points, we can reconstruct these spheres, computing the 3D position of the elbow and finger. These steps are fully automated and do not require human intervention. The system presented can be used as a visualization tool, or as a user input interface, in cases when the user would rather not be constrained by the camera system.
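The sphere construction reduces to a ray-sphere intersection per joint: the camera ray through the observed pixel is intersected with a sphere whose radius is the (constant) limb length, centered at the already-recovered parent joint. A minimal sketch, assuming a pinhole camera at the origin with focal length f (the camera details are my assumption, not spelled out in the abstract):

```python
import numpy as np

def ray_sphere_point(pixel, f, center, radius):
    """Intersect the camera ray through an image point with a sphere of
    known radius (the constant limb length) around a known 3D center
    (e.g. the shoulder when solving for the elbow), returning the nearer
    3D intersection, or None if the ray misses the sphere."""
    u, v = pixel
    center = np.asarray(center, dtype=float)
    d = np.array([u / f, v / f, 1.0])
    d /= np.linalg.norm(d)                 # unit ray direction
    # solve ||t*d - center||^2 = radius^2 for t (quadratic with a = 1)
    b = -2.0 * d @ center
    c = center @ center - radius ** 2
    disc = b * b - 4.0 * c
    if disc < 0:
        return None
    t = (-b - np.sqrt(disc)) / 2.0         # nearer of the two intersections
    return t * d
```

Applying this once with the shoulder as center gives the elbow, and again with the elbow as center gives the fingertip.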
Citations: 67