Face recognition can be considered a one-class classification problem, and associative memory (AM) based approaches have proven efficient in previous studies. In this paper, a kernel associative memory (KAM) based face recognition scheme with a multiscale Gabor transform is proposed. In our method, face images of each person are first decomposed into their multiscale representations by a quasi-complete Gabor transform, which are then modelled by kernel associative memories. The pyramidal multiscale Gabor wavelet transform not only provides a very efficient spatial-domain implementation of the Gabor transform but also permits fast reconstruction. In the testing phase, a query face image is also represented by a Gabor multiresolution pyramid, and the recalled results from the KAM models corresponding to the even Gabor channels are simply added together to provide a reconstruction. The recognition scheme was thoroughly tested on several benchmark face datasets, including the AR, UMIST, JAFFE and Yale A faces. The experimental results demonstrate strong robustness in recognizing faces under different conditions, particularly pose alterations, varying occlusions and expression changes.
B. Zhang and C. Leung, "Robust face recognition by multiscale kernel associative memory models based on hierarchical spatial-domain Gabor transforms," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), 2006. doi:10.1109/FGR.2006.95
This paper presents a full-body gesture database which contains 2D video data and 3D motion data of 14 normal gestures, 10 abnormal gestures and 30 command gestures for 20 subjects. We call this database the Korea University Gesture (KUG) database. Using 3D motion cameras and 3 sets of stereo cameras, we captured 3D motion data and 3 pairs of stereo-video data from 3 different directions for the normal and abnormal gestures. For the command gestures, 2 pairs of stereo-video data are obtained by 2 sets of stereo cameras with different focal lengths in order to effectively capture views of the whole body and the upper body simultaneously. In addition, 2D silhouette data is synthesized by separating the subject from the background in the 2D stereo-video data and saved as binary mask images. In this paper, we describe the gesture capture system, the organization of the database, its potential usages and the way of obtaining the KUG database.
Bon-Woo Hwang, Sungmin Kim and Seong-Whan Lee, "A full-body gesture database for automatic gesture recognition," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), 2006. doi:10.1109/FGR.2006.8
We describe an accurate and robust method of locating facial features. The method utilises a set of feature templates in conjunction with a shape constrained search technique. The current feature templates are correlated with the target image to generate a set of response surfaces. The parameters of a statistical shape model are optimised to maximise the sum of responses. Given the new feature locations, the feature templates are updated using a nearest neighbour approach to select likely feature templates from the training set. We find that this template selection tracker (TST) method outperforms previous approaches using fixed template feature detectors. It gives results similar to the more complex active appearance model (AAM) algorithm on two publicly available static image sets and outperforms the AAM on a more challenging set of in-car face sequences.
David Cristinacce and Tim Cootes, "Facial feature detection and tracking with automatic template selection," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), 2006. doi:10.1109/FGR.2006.50
Artificial face recognition systems typically do not attempt to handle very variable images. By comparison, human perceivers can recognize familiar faces under much more varied conditions. We describe a prototype face representation based on simple image-averaging. We have argued that this forms a good candidate for understanding human face perception. Here we examine the stability of these representations by asking (i) how quickly they converge; and (ii) how resistant they are to contamination due to previous misidentifications. We conclude that face averages provide promising representations for use in artificial recognition.
R. Jenkins, A. Burton and D. White, "Face Recognition from Unconstrained Images: Progress with Prototypes," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), 2006. doi:10.1109/FGR.2006.45
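The image-averaging prototype at the heart of this paper lends itself to a very short sketch. The following is an illustrative reconstruction, not the authors' code; the synthetic 32x32 arrays stand in for real aligned face images of one person:

```python
import numpy as np

def face_prototype(images):
    """Average a stack of aligned face images into a single prototype.

    images: array of shape (n, h, w). Returns the mean image,
    i.e. the 'average' face representation.
    """
    images = np.asarray(images, dtype=np.float64)
    return images.mean(axis=0)

# Synthetic stand-in for varied, aligned images of one person.
rng = np.random.default_rng(0)
base = rng.uniform(0, 255, size=(32, 32))                   # the underlying face
noisy_views = base + rng.normal(0, 20, size=(50, 32, 32))   # 50 varied views

proto = face_prototype(noisy_views)
# The average is closer to the underlying face than any single view,
# which is why the prototype stabilizes as more images accumulate.
err_proto = np.abs(proto - base).mean()
err_single = np.abs(noisy_views[0] - base).mean()
print(err_proto < err_single)
```

Averaging cancels image-specific variation (lighting, expression) while preserving what is stable across views, which is the convergence property the paper examines.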
Automatically recovering human poses from visual input is useful but challenging due to variations in image space and the high dimensionality of the pose space. In this paper, we assume that a human silhouette can be extracted from monocular visual input. We compare three shape descriptors that are used in the encoding of silhouettes: Fourier descriptors, shape contexts and Hu moments. An example-based approach is taken to recover upper body poses from these descriptors. We perform experiments with deformed silhouettes to test each descriptor's robustness against variations in body dimensions, viewpoint and noise. It is shown that Fourier descriptors and shape context histograms outperform Hu moments for all deformations.
R. Poppe and M. Poel, "Comparison of silhouette shape descriptors for example-based human pose recovery," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), 2006. doi:10.1109/FGR.2006.32
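Of the three descriptors compared, Fourier descriptors are the easiest to sketch. The snippet below is a generic construction under common normalization conventions (drop the DC term, take magnitudes, divide by the first harmonic); the paper's exact variant may differ:

```python
import numpy as np

def fourier_descriptors(contour, k=8):
    """Fourier descriptors of a closed 2D contour.

    contour: (n, 2) array of boundary points ordered along the outline.
    Returns k magnitudes, normalized for translation (DC term removed),
    rotation/starting point (magnitudes only) and scale (division by
    the first harmonic).
    """
    z = contour[:, 0] + 1j * contour[:, 1]   # complex boundary signal
    F = np.fft.fft(z)
    F[0] = 0.0                               # drop DC term -> translation invariant
    mags = np.abs(F)                         # magnitudes -> rotation/start invariant
    mags = mags / mags[1]                    # first-harmonic norm -> scale invariant
    return mags[1:k + 1]

# A circle sampled along its boundary, and a scaled, translated copy.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
moved = 3.0 * circle + np.array([10.0, -5.0])

d1 = fourier_descriptors(circle)
d2 = fourier_descriptors(moved)
print(np.allclose(d1, d2))  # invariant to scale and translation
```

This invariance to similarity transforms is exactly what makes such descriptors attractive for silhouettes seen at different distances and positions in the frame.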
A new method for face recognition, landmark model matching, is proposed in this paper. It is based on the concepts of elastic bunch graph matching and the active shape model, and is optimised with particle swarm optimisation. It is a fully automatic algorithm and can be used for face databases where only one image per person is available. A face is represented by a landmark model consisting of nodes labelled with jets and gray-level profiles. A landmark distribution model is created from a few training images. The model similarity between the landmark distribution model and the deformable landmark model that has to be fitted to the face in the image is maximised by particle swarm optimisation, to find the optimal model to represent the face. Improved results were obtained by this method compared with elastic bunch graph matching without optimisation.
R. Senaratne and S. Halgamuge, "Optimised landmark model matching for face recognition," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), 2006. doi:10.1109/FGR.2006.85
This paper presents a framework for age-group classification using facial images under various lighting conditions. Our method is based on the appearance-based approach that projects images from the original image space into a face subspace. We propose a two-phase approach (2DLDA+LDA), which is based on 2DPCA and LDA. Our experimental results show that the new 2DLDA+LDA-based approach improves classification accuracy over the conventional PCA-based and LDA-based approaches. Moreover, the effectiveness of eliminating dimensions that do not contain important discriminative information is confirmed. The accuracy rates are 46.3%, 67.8% and 78.1% for age-groups in the 5-year, 10-year and 15-year ranges, respectively.
K. Ueki, T. Hayashida and Tetsunori Kobayashi, "Subspace-based age-group classification using facial images under various lighting conditions," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), 2006. doi:10.1109/FGR.2006.102
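The 2DPCA building block underlying the proposed 2DLDA+LDA pipeline can be sketched as follows. This is a generic 2DPCA projection, not the authors' implementation; the random matrices are placeholders for face images, and the subsequent LDA stage is omitted:

```python
import numpy as np

def two_d_pca(images, d=2):
    """2DPCA: project image rows onto the top-d eigenvectors of the
    image covariance matrix G = mean((A - Abar)^T (A - Abar)).

    images: (n, h, w) array. Returns (n, h, d) feature matrices and
    the (w, d) projection basis.
    """
    A = np.asarray(images, dtype=np.float64)
    centered = A - A.mean(axis=0)
    # G[j, k] = (1/n) * sum_n sum_i centered[n, i, j] * centered[n, i, k]
    G = np.einsum('nij,nik->jk', centered, centered) / len(A)  # (w, w)
    vals, vecs = np.linalg.eigh(G)            # eigenvalues in ascending order
    X = vecs[:, ::-1][:, :d]                  # keep the top-d eigenvectors
    return A @ X, X

rng = np.random.default_rng(1)
faces = rng.normal(size=(20, 16, 12))   # 20 placeholder 16x12 "images"
feats, X = two_d_pca(faces, d=3)
print(feats.shape, X.shape)  # (20, 16, 3) (12, 3)
```

Unlike classical PCA, 2DPCA works on image matrices directly rather than flattened vectors, so the covariance matrix stays small (width x width) and the projected features retain a row structure that later stages such as LDA can consume.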
We address the task of accurately localizing the eyes in face images extracted by a face detector, an important problem to be solved because of the negative effect of poor localization on face recognition accuracy. We investigate three approaches to the task: a regression approach aiming to directly minimize errors in the predicted eye positions, a simple Bayesian model of eye and non-eye appearance, and a discriminative eye detector trained using AdaBoost. By using identical training and test data for each method we are able to perform an unbiased comparison. We show that, perhaps surprisingly, the simple Bayesian approach performs best on databases including challenging images, and its performance is comparable to more complex state-of-the-art methods.
M. Everingham and Andrew Zisserman, "Regression and classification approaches to eye localization in face images," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), 2006. doi:10.1109/FGR.2006.90
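A minimal version of a Bayesian eye/non-eye appearance model might look like the following independent-pixel Gaussian log-likelihood-ratio test. The patch size, synthetic training data, and independence assumption are illustrative choices, not details taken from the paper:

```python
import numpy as np

def fit_gaussian(patches):
    """Fit an independent-pixel Gaussian to a stack of flattened patches."""
    P = patches.reshape(len(patches), -1).astype(np.float64)
    return P.mean(axis=0), P.std(axis=0) + 1e-6  # small floor avoids zero variance

def log_likelihood_ratio(patch, eye_model, noneye_model):
    """Score a patch as log p(patch|eye) - log p(patch|non-eye).
    Positive scores favour the eye class."""
    x = patch.ravel().astype(np.float64)
    def loglik(model):
        mu, sigma = model
        return -0.5 * np.sum(((x - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2))
    return loglik(eye_model) - loglik(noneye_model)

rng = np.random.default_rng(2)
eyes = 0.8 + 0.1 * rng.normal(size=(200, 9, 9))      # brighter synthetic "eye" patches
noneyes = 0.2 + 0.1 * rng.normal(size=(200, 9, 9))   # darker background patches

eye_m, bg_m = fit_gaussian(eyes), fit_gaussian(noneyes)
score_eye = log_likelihood_ratio(0.8 + 0.1 * rng.normal(size=(9, 9)), eye_m, bg_m)
score_bg = log_likelihood_ratio(0.2 + 0.1 * rng.normal(size=(9, 9)), eye_m, bg_m)
print(score_eye > 0, score_bg < 0)
```

In localization, such a scorer would be evaluated at candidate positions inside the detected face region, taking the maximum-scoring location as the eye estimate.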
Gait recognition is used to identify individuals in image sequences by the way they walk. Nearly all of the approaches proposed for gait recognition are 2D methods based on analyzing image sequences captured by a single camera. In this paper, video sequences captured by multiple cameras are used as input, and a human 3D model is set up. The motion is tracked by applying a local optimization algorithm. The lengths of key segments are extracted as static parameters, and the motion trajectories of the lower limbs are used as dynamic features. Finally, linear time normalization is exploited for matching and recognition. The proposed 3D tracking and recognition method is robust to changes of viewpoint. Moreover, better results are achieved than with 2D methods for sequences containing difficult surface variations, which demonstrates the effectiveness of our algorithm.
Guoying Zhao, Guoyi Liu, Hua Li and M. Pietikäinen, "3D gait recognition using multiple cameras," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), 2006. doi:10.1109/FGR.2006.2
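The linear time normalization step used for matching can be sketched as plain linear resampling of a trajectory to a fixed length. The synthetic knee-angle curves below are placeholder assumptions standing in for the tracked lower-limb trajectories:

```python
import numpy as np

def linear_time_normalize(trajectory, length=100):
    """Resample a 1D joint-angle trajectory to a fixed number of samples
    by linear interpolation, so gait cycles recorded at different walking
    speeds (or frame counts) can be compared sample-by-sample."""
    traj = np.asarray(trajectory, dtype=np.float64)
    src = np.linspace(0.0, 1.0, num=len(traj))
    dst = np.linspace(0.0, 1.0, num=length)
    return np.interp(dst, src, traj)

# The same knee-angle curve observed at two different walking speeds.
fast = np.sin(np.linspace(0, 2 * np.pi, 40))   # 40-frame cycle
slow = np.sin(np.linspace(0, 2 * np.pi, 90))   # 90-frame cycle

a = linear_time_normalize(fast)
b = linear_time_normalize(slow)
dist = np.abs(a - b).mean()   # small after normalization
```

After normalization, both sequences live on a common 100-sample time axis, so a simple distance between them becomes meaningful despite the differing cycle lengths.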
We examined the open issue of whether FACS action units (AUs) can be recognized more accurately by classifying local regions around the eyes, brows, and mouth compared to analyzing the face as a whole. Our empirical results showed that, contrary to our intuition, local expression analysis showed no consistent improvement in recognition accuracy. Moreover, global analysis outperformed local analysis on certain AUs of the eye and brow regions. We attributed this unexpected result partly to high correlations between different AUs in the Cohn-Kanade expression database. This underlines the importance of establishing a large, publicly available AU database with singly-occurring AUs to facilitate future research.
J. Whitehill and C. Omlin, "Local versus global segmentation for facial expression recognition," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), 2006. doi:10.1109/FGR.2006.74