Quaternion-based Local Binary Patterns for illumination invariant face recognition
Pub Date: 2015-05-19 | DOI: 10.1109/ICB.2015.7139093
Dayron Rizo-Rodriguez, D. Ziou
Face recognition under varying illumination conditions has benefited from the encapsulation of face features into a quaternion representation. In particular, a quaternion-based representation encoding four LBP descriptors (one-pixel to four-pixel radius) has delivered interesting recognition scores when using just one training image per subject. However, each coefficient of such a representation encapsulates only the eight sampling values required for computing a classic LBP code. In this paper, we propose a quaternion representation that encodes additional LBP codes at each radius. Consequently, a richer pixel descriptor is obtained because further sampling values in the region surrounding the reference pixel are considered. Illumination-invariant face verification and identification experiments are conducted using only one training face image. The proposed representation improves on the recognition rates of the representation that encapsulates only classic LBP codes.
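As an illustration of the kind of representation discussed above, the following minimal sketch (not the authors' implementation) computes a classic 8-point LBP code at four radii for every pixel and stacks the four code maps into a quaternion-like 4-channel descriptor. The use of scikit-image's LBP operator and the toy input are assumptions.

```python
# A minimal sketch: one classic 8-bit LBP code per radius (1..4) per pixel,
# packed into a 4-channel array playing the role of the quaternion coefficients.
import numpy as np
from skimage.feature import local_binary_pattern

def quaternion_lbp(image, radii=(1, 2, 3, 4), points=8):
    """Return an (H, W, 4) array of LBP codes, one channel per sampling radius."""
    channels = [local_binary_pattern(image, points, r, method="default") for r in radii]
    return np.stack(channels, axis=-1)

# toy usage on a random grayscale "face" image
face = np.random.rand(128, 128)
descriptor = quaternion_lbp(face)
print(descriptor.shape)  # (128, 128, 4)
```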
{"title":"Quaternion-based Local Binary Patterns for illumination invariant face recognition","authors":"Dayron Rizo-Rodriguez, D. Ziou","doi":"10.1109/ICB.2015.7139093","DOIUrl":"https://doi.org/10.1109/ICB.2015.7139093","url":null,"abstract":"Face recognition under varying illumination conditions has benefited from the encapsulation of face features into a quaternion representation. In particular, a quaternion-based representation encoding four LBP descriptors (one-pixel to four-pixel radius) has delivered interesting recognition scores when using just one training image per subject. However, each coefficient of such a representation only encapsulates the eight sampling values required for computing a classic LBP code. In this paper, we propose a quaternion representation which encodes additional LBP codes at each radius. Consequently, a wider pixel descriptor is obtained because further sampling values are considered in the region surrounding the reference pixel. Illumination invariant face verification and identification experiments are conducted by using only one training face image. The representation proposed improves the recognition rates reported by the representation only encapsulating classic LBP codes.","PeriodicalId":237372,"journal":{"name":"2015 International Conference on Biometrics (ICB)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130403258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards Bloom filter-based indexing of iris biometric data
Pub Date: 2015-05-19 | DOI: 10.1109/ICB.2015.7139105
C. Rathgeb, Harald Baier, C. Busch, Frank Breitinger
Conventional biometric identification systems require exhaustive 1 : N comparisons in order to identify biometric probes, i.e., comparison time frequently dominates the overall computational workload. Biometric database indexing represents a challenging task since biometric data is fuzzy and does not exhibit any natural sorting order. In this paper we present a preliminary study on the feasibility of applying Bloom filters to iris biometric database indexing. It is shown that, by constructing a binary tree data structure of Bloom filters extracted from binary iris biometric templates (iris-codes), the search space can be reduced to O(log N). In experiments carried out on a database of N = 256 classes, biometric performance (accuracy) is maintained for different conventional identification systems. Further, perspectives on how to employ the proposed scheme on large-scale databases are given.
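The sketch below illustrates the tree-search idea in simplified form; it is not the paper's exact construction. Leaves hold per-template Bloom filters (bit arrays), inner nodes hold the bitwise OR of their children, and a probe descends toward the more similar child, so only about 2·log2(N) filter comparisons are needed instead of N. The tree construction, the dissimilarity measure, and the random stand-in filters are assumptions; retrieval accuracy depends on how the tree is actually built.

```python
# Hedged sketch of Bloom-filter tree indexing (illustrative only).
import numpy as np

def dissimilarity(a, b):
    """Fraction of disagreeing set bits between two Bloom filters."""
    return np.count_nonzero(a ^ b) / max(np.count_nonzero(a) + np.count_nonzero(b), 1)

class Node:
    def __init__(self, bits, left=None, right=None, label=None):
        self.bits, self.left, self.right, self.label = bits, left, right, label

def build_tree(filters, labels):
    """Bottom-up binary tree over enrolled filters (N assumed to be a power of two)."""
    nodes = [Node(f, label=l) for f, l in zip(filters, labels)]
    while len(nodes) > 1:
        nodes = [Node(nodes[i].bits | nodes[i + 1].bits, nodes[i], nodes[i + 1])
                 for i in range(0, len(nodes), 2)]
    return nodes[0]

def search(root, probe):
    """Descend toward the more similar child; return (leaf label, #filter comparisons)."""
    node, comparisons = root, 0
    while node.label is None:
        comparisons += 2
        go_left = dissimilarity(probe, node.left.bits) <= dissimilarity(probe, node.right.bits)
        node = node.left if go_left else node.right
    return node.label, comparisons

rng = np.random.default_rng(0)
enrolled = rng.random((256, 1024)) < 0.1          # stand-in Bloom filters for N = 256 classes
root = build_tree(list(enrolled), list(range(256)))
label, comparisons = search(root, enrolled[42])
print(comparisons)                                # 16 comparisons (2*log2(256)) instead of 256
```

The point of the sketch is the workload reduction: the number of comparisons grows with the tree depth rather than with the number of enrolled subjects.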
{"title":"Towards Bloom filter-based indexing of iris biometric data","authors":"C. Rathgeb, Harald Baier, C. Busch, Frank Breitinger","doi":"10.1109/ICB.2015.7139105","DOIUrl":"https://doi.org/10.1109/ICB.2015.7139105","url":null,"abstract":"Conventional biometric identification systems require exhaustive 1 : N comparisons in order to identify biometric probes, i.e. comparison time frequently dominates the overall computational workload. Biometric database indexing represents a challenging task since biometric data is fuzzy and does not exhibit any natural sorting order. In this paper we present a preliminary study on the feasibility of applying Bloom filters for the purpose of iris biometric database indexing. It is shown, that by constructing a binary tree data structure of Bloom filters extracted from binary iris biometric templates (iris-codes) the search space can be reduced to O(logN). In experiments, which are carried out on a database of N = 256 classes, biometric performance (accuracy) is maintained for different conventional identification systems. Further, perspectives on how to employ the proposed scheme on large-scale databases are given.","PeriodicalId":237372,"journal":{"name":"2015 International Conference on Biometrics (ICB)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130692631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
VT-KFER: A Kinect-based RGBD+time dataset for spontaneous and non-spontaneous facial expression recognition
Pub Date: 2015-05-19 | DOI: 10.1109/ICB.2015.7139081
S. Aly, Andrea Trubanova, A. L. Abbott, S. White, A. Youssef
Human facial expressions have been extensively studied using 2D static images or 2D video sequences. The main limitations of 2D-based analysis are problems associated with large variations in pose and illumination. An alternative, therefore, is to utilize depth information captured from 3D sensors, which is both pose and illumination invariant. The Kinect sensor is an inexpensive, portable, and fast way to capture depth information. However, only a few researchers have utilized the Kinect sensor for automatic recognition of facial expressions. This is partly due to the lack of a publicly available Kinect-based RGBD facial expression recognition (FER) dataset that contains the relevant facial expressions and their associated semantic labels. This paper addresses this problem by presenting the first publicly available RGBD+time facial expression recognition dataset, captured with the Kinect 1.0 sensor in both scripted (acted) and unscripted (spontaneous) scenarios. Our fully annotated dataset includes seven expressions (happiness, sadness, surprise, disgust, fear, anger, and neutral) for 32 subjects (males and females) aged 10 to 30 and with different skin tones. Both human and machine evaluations were conducted. Each scripted expression was ranked quantitatively by two research assistants in the Psychology department. Baseline machine evaluation resulted in average recognition accuracy of 60% and 58.3% for 6-expression and 7-expression recognition, respectively, when features from 2D and 3D data were combined.
{"title":"VT-KFER: A Kinect-based RGBD+time dataset for spontaneous and non-spontaneous facial expression recognition","authors":"S. Aly, Andrea Trubanova, A. L. Abbott, S. White, A. Youssef","doi":"10.1109/ICB.2015.7139081","DOIUrl":"https://doi.org/10.1109/ICB.2015.7139081","url":null,"abstract":"Human facial expressions have been extensively studied using 2D static images or 2D video sequences. The main limitations of 2D-based analysis are problems associated with large variations in pose and illumination. Therefore, an alternative is to utilize depth information, captured from 3D sensors, which is both pose and illumination invariant. The Kinect sensor is an inexpensive, portable, and fast way to capture the depth information. However, only a few researchers have utilized the Kinect sensor for the automatic recognition of facial expressions. This is partly due to the lack of a Kinect-based publicly available RGBD facial expression recognition (FER) dataset that contains the relevant facial expressions and their associated semantic labels. This paper addresses this problem by presenting the first publicly available RGBD+time facial expression recognition dataset using the Kinect 1.0 sensor in both scripted (acted) and unscripted (spontaneous) scenarios. Our fully annotated dataset includes seven expressions (happiness, sadness, surprise, disgust, fear, anger, and neutral) for 32 subjects (males and females) aged from 10 to 30 and with different skin tones. Both human and machine evaluation were conducted. Each scripted expression was ranked quantitatively by two research assistants in the Psychology department. Baseline machine evaluation resulted in average recognition accuracy levels of 60% and 58.3% for 6 expressions and 7 expressions recognition, respectively, when features from 2D and 3D data were combined.","PeriodicalId":237372,"journal":{"name":"2015 International Conference on Biometrics (ICB)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133611462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Composite sketch recognition via deep network - a transfer learning approach
Pub Date: 2015-05-19 | DOI: 10.1109/ICB.2015.7139092
Paritosh Mittal, Mayank Vatsa, Richa Singh
Sketch recognition is one of the integral components used by law enforcement agencies in solving crime. In the recent past, software-generated composite sketches have been preferred because they are more consistent and faster to construct than hand-drawn sketches. Matching these composite sketches to face photographs is a complex task because the composite sketches are drawn based on the witness description and lack the minute details present in photographs. This paper presents a novel algorithm for matching composite sketches with photographs using transfer learning with a deep learning representation. In the proposed algorithm, the deep-learning-based facial representation is first learned using a large database of face photographs and is then updated using a small problem-specific training database. Experiments are performed on the extended PRIP database, and it is observed that the proposed algorithm outperforms a recently proposed approach and a commercial face recognition system.
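The two-stage idea can be sketched as follows. This is an illustration of the general transfer-learning recipe, not the authors' architecture: a small CNN representation is trained on a large photo set, then its lower layers are frozen and the upper layers are updated on a small problem-specific set. The network, hyperparameters, and random stand-in tensors are assumptions.

```python
# Hedged PyTorch sketch of "pretrain on large photo database, fine-tune on small set".
import torch
import torch.nn as nn

class FaceNet(nn.Module):
    def __init__(self, n_ids):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(), nn.Linear(32 * 16, 128), nn.ReLU())
        self.classifier = nn.Linear(128, n_ids)

    def forward(self, x):
        return self.classifier(self.features(x))

def train(model, x, y, epochs=3, lr=1e-3):
    opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()

# Stage 1: learn the representation from a (stand-in) large photo database.
model = FaceNet(n_ids=100)
train(model, torch.randn(64, 1, 64, 64), torch.randint(0, 100, (64,)))

# Stage 2: freeze the convolutional layers and update the representation on a
# small problem-specific database (e.g., composite sketches of a few subjects).
for p in model.features[:4].parameters():
    p.requires_grad = False
model.classifier = nn.Linear(128, 10)   # new head for the small gallery
train(model, torch.randn(20, 1, 64, 64), torch.randint(0, 10, (20,)))
```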
{"title":"Composite sketch recognition via deep network - a transfer learning approach","authors":"Paritosh Mittal, Mayank Vatsa, Richa Singh","doi":"10.1109/ICB.2015.7139092","DOIUrl":"https://doi.org/10.1109/ICB.2015.7139092","url":null,"abstract":"Sketch recognition is one of the integral components used by law enforcement agencies in solving crime. In recent past, software generated composite sketches are being preferred as they are more consistent and faster to construct than hand drawn sketches. Matching these composite sketches to face photographs is a complex task because the composite sketches are drawn based on the witness description and lack minute details which are present in photographs. This paper presents a novel algorithm for matching composite sketches with photographs using transfer learning with deep learning representation. In the proposed algorithm, first the deep learning architecture based facial representation is learned using large face database of photos and then the representation is updated using small problem-specific training database. Experiments are performed on the extended PRIP database and it is observed that the proposed algorithm outperforms recently proposed approach and a commercial face recognition system.","PeriodicalId":237372,"journal":{"name":"2015 International Conference on Biometrics (ICB)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114492410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A longitudinal study of automatic face recognition
Pub Date: 2015-05-19 | DOI: 10.1109/ICB.2015.7139087
L. Best-Rowden, Anil K. Jain
With the deployment of automatic face recognition systems for many large-scale applications, it is crucial that we gain a thorough understanding of how facial aging affects the recognition performance, particularly across a large population. Because aging is a complex process involving genetic and environmental factors, some faces “age well” while the appearance of others changes drastically over time. This heterogeneity (inter-subject variability) suggests the need for a subject-specific aging analysis. In this paper, we conduct such an analysis using a longitudinal database of 147,784 operational mug shots of 18,007 repeat criminal offenders, where each subject has at least five face images acquired over a minimum of five years. By fitting multilevel statistical models to genuine similarity scores from two commercial-off-the-shelf (COTS) matchers, we quantify (i) the population average rate of change in genuine scores with respect to the elapsed time between two face images, and (ii) how closely the subject-specific rates of change follow the population average. Longitudinal analysis of the scores from the more accurate COTS matcher shows that despite decreasing genuine scores over time, the average subject can still be correctly verified at a false accept rate (FAR) of 0.01% across all 16 years of elapsed time in our database. We also investigate (i) the effects of several other covariates (gender, race, face quality), and (ii) the probability of true acceptance over time.
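A hedged sketch of the multilevel-modelling step described above: genuine comparison scores are regressed on elapsed time between image pairs, with a random intercept and random slope per subject, so that subject-specific rates of change can be compared with the population average. The synthetic data frame and column names below are stand-ins, not the operational mug-shot data.

```python
# Mixed-effects (multilevel) model of genuine scores vs. elapsed time, per subject.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_subjects, n_pairs = 50, 8
subject = np.repeat(np.arange(n_subjects), n_pairs)
elapsed_years = rng.uniform(0, 16, size=subject.size)
subj_slope = rng.normal(-0.02, 0.01, size=n_subjects)          # per-subject aging rate
score = 0.8 + subj_slope[subject] * elapsed_years + rng.normal(0, 0.03, subject.size)

df = pd.DataFrame({"subject": subject, "elapsed_years": elapsed_years, "score": score})

# Random intercept + random slope for elapsed time, grouped by subject.
model = smf.mixedlm("score ~ elapsed_years", df, groups=df["subject"],
                    re_formula="~elapsed_years")
result = model.fit()
print(result.params["elapsed_years"])  # population-average rate of change in genuine score
```

The random-effects variance around the fixed slope is what quantifies how closely individual subjects follow the population-average aging trend.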
{"title":"A longitudinal study of automatic face recognition","authors":"L. Best-Rowden, Anil K. Jain","doi":"10.1109/ICB.2015.7139087","DOIUrl":"https://doi.org/10.1109/ICB.2015.7139087","url":null,"abstract":"With the deployment of automatic face recognition systems for many large-scale applications, it is crucial that we gain a thorough understanding of how facial aging affects the recognition performance, particularly across a large population. Because aging is a complex process involving genetic and environmental factors, some faces “age well” while the appearance of others changes drastically over time. This heterogeneity (inter-subject variability) suggests the need for a subject-specific aging analysis. In this paper, we conduct such an analysis using a longitudinal database of 147,784 operational mug shots of 18,007 repeat criminal offenders, where each subject has at least five face images acquired over a minimum of five years. By fitting multilevel statistical models to genuine similarity scores from two commercial-off-the-shelf (COTS) matchers, we quantify (i) the population average rate of change in genuine scores with respect to the elapsed time between two face images, and (ii) how closely the subject-specific rates of change follow the population average. Longitudinal analysis of the scores from the more accurate COTS matcher shows that despite decreasing genuine scores over time, the average subject can still be correctly verified at a false accept rate (FAR) of 0.01% across all 16 years of elapsed time in our database. We also investigate (i) the effects of several other covariates (gender, race, face quality), and (ii) the probability of true acceptance over time.","PeriodicalId":237372,"journal":{"name":"2015 International Conference on Biometrics (ICB)","volume":"531 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133432349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Touch-interaction behavior for continuous user authentication on smartphones
Chao Shen, Yong Zhang, Zhongmin Cai, Tianwen Yu, X. Guan
Pub Date: 2015-05-19 | DOI: 10.1109/ICB.2015.7139046
The increasing use of smartphones to access sensitive and private data has given rise to the need for secure authentication techniques. Existing authentication mechanisms on smartphones provide only one-time verification, so verified users remain vulnerable to session hijacking. In this article, we propose a transparent and continuous authentication system based on users' touch-interaction data. We consider four common types of touch operations and extract behavioral features to characterize users' touch behavior. A distance-measurement technique is applied to mitigate behavioral variability. A multiple-decision procedure is then developed to perform continuous authentication, in which one-class classification algorithms are applied for classification. The efficacy of our approach is validated through a series of experiments in four real-world application scenarios. Our experimental results show that the proposed approach can verify a user in an accurate and timely manner and can serve as an enhancement to traditional authentication mechanisms on smartphones.
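The following sketch illustrates the one-class-classification and decision-fusion steps in simplified form; it is not the paper's exact pipeline. A one-class SVM is trained on the legitimate user's touch-behavior feature vectors, and a sliding window of recent decisions is combined so that authentication is continuous rather than one-shot. Feature extraction from raw touch events is omitted, and the arrays are random stand-ins.

```python
# Hedged sketch: one-class SVM on owner features + sliding-window decision fusion.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
owner_train = rng.normal(0.0, 1.0, size=(200, 10))      # legitimate user's feature vectors
session_owner = rng.normal(0.0, 1.0, size=(30, 10))     # later session, same user
session_intruder = rng.normal(2.5, 1.0, size=(30, 10))  # hijacked session

scaler = StandardScaler().fit(owner_train)
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(scaler.transform(owner_train))

def continuous_decision(stream, window=5):
    """Accept while the majority of the last `window` operations look genuine."""
    votes = clf.predict(scaler.transform(stream))        # +1 genuine, -1 anomalous
    return [votes[max(0, i - window + 1): i + 1].mean() > 0 for i in range(len(votes))]

print(sum(continuous_decision(session_owner)), "of 30 accepted for the owner")
print(sum(continuous_decision(session_intruder)), "of 30 accepted for an intruder")
```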
{"title":"Touch-interaction behavior for continuous user authentication on smartphones","authors":"Chao Shen, Yong Zhang, Zhongmin Cai, Tianwen Yu, X. Guan","doi":"10.1109/ICB.2015.7139046","DOIUrl":"https://doi.org/10.1109/ICB.2015.7139046","url":null,"abstract":"The increasing use of smartphones to access sensitive and privacy data has given rise to the need of secure authentication technique. Existing authentication mechanisms on smartphones can only provide one-time verification, but the verified users are still vulnerable to the session hijacking. In this article, we propose a transparent and continuous authentication system based on users' touch interaction data. We consider four common types of touch operations, and extract behavioral features to characterize users' touch behavior. Distance-measurement technique is applied to mitigate the behavioral variability. Then a multiple decision procedure is developed to perform continuous authentication, in which one-class classification algorithms are applied for classification. The efficacy of our approach is validated through a series of experiments in four real-world application scenarios. Our experimental results show that the proposed approach could verify a user in an accurate and timely manner, and suffice to be an enhancement for traditional authentication mechanisms in smartphones.","PeriodicalId":237372,"journal":{"name":"2015 International Conference on Biometrics (ICB)","volume":"335 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123779202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Harnessing social context for improved face recognition
Romil Bhardwaj, Gaurav Goswami, Richa Singh, Mayank Vatsa
Pub Date: 2015-05-19 | DOI: 10.1109/ICB.2015.7139085
Face recognition is traditionally based on features extracted from face images that capture various intrinsic characteristics of faces to distinguish between individuals. However, humans do not perform face recognition in isolation; they also utilize a wide variety of contextual cues to recognize faces accurately. Social context, or the co-occurrence of individuals, is one such cue that humans utilize to reinforce face recognition output. A social graph can adequately model social relationships between individuals, and this can be utilized to augment traditional face recognition methods. In this research, we propose a novel method to generate a social graph from a collection of group photographs and learn the social context information. We also propose a novel algorithm that combines results from a commercial face recognition system with social context information to perform face identification. Experimental results on two publicly available datasets show that social context information can improve face recognition and help bridge the gap between humans and machines in face recognition.
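An illustrative sketch of the general idea (assumptions throughout, not the authors' algorithm): a co-occurrence graph is built from group photographs, and identification scores for a face are re-weighted by how often each candidate identity co-occurs with the identities already resolved in the same photo. The fusion weight `alpha` and the toy data are hypothetical.

```python
# Hedged sketch: co-occurrence "social graph" + score-level fusion for identification.
from collections import defaultdict
from itertools import combinations

def build_social_graph(photos):
    """photos: list of lists of identity labels appearing together in one photo."""
    graph = defaultdict(float)
    for people in photos:
        for a, b in combinations(sorted(set(people)), 2):
            graph[(a, b)] += 1.0
    return graph

def context_score(graph, a, b):
    return graph.get(tuple(sorted((a, b))), 0.0)

def rerank(face_scores, companions, graph, alpha=0.1):
    """face_scores: {candidate_id: matcher score} for one unknown face;
    companions: identities already resolved for other faces in the same photo."""
    return max(face_scores,
               key=lambda cand: face_scores[cand]
               + alpha * sum(context_score(graph, cand, c) for c in companions))

# toy example: alice and bob frequently appear together in the training albums
graph = build_social_graph([["alice", "bob"], ["alice", "bob", "carol"], ["bob", "dave"]])
scores = {"alice": 0.52, "carol": 0.55}                   # matcher slightly prefers carol
print(rerank(scores, companions=["bob"], graph=graph))    # social context flips it to alice
```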
{"title":"Harnessing social context for improved face recognition","authors":"Romil Bhardwaj, Gaurav Goswami, Richa Singh, Mayank Vatsa","doi":"10.1109/ICB.2015.7139085","DOIUrl":"https://doi.org/10.1109/ICB.2015.7139085","url":null,"abstract":"Face recognition is traditionally based on features extracted from face images which capture various intrinsic characteristics of faces to distinguish between individuals. However, humans do not perform face recognition in isolation and instead utilize a wide variety of contextual cues as well in order to perform accurate recognition. Social context or co-occurrence of individuals is one such cue that humans utilize to reinforce face recognition output. A social graph can adequately model social-relationships between different individuals and this can be utilized to augment traditional face recognition methods. In this research, we propose a novel method to generate a social-graph based on a collection of group photographs and learn the social context information. We also propose a novel algorithm to combine results from a commercial face recognition system and social context information to perform face identification. Experimental results on two publicly available datasets show that social context information can improve face recognition and help bridge the gap between humans and machines in face recognition.","PeriodicalId":237372,"journal":{"name":"2015 International Conference on Biometrics (ICB)","volume":"5 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130095918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
I know that voice: Identifying the voice actor behind the voice
Pub Date: 2015-05-19 | DOI: 10.1109/ICB.2015.7139074
L. Uzan, Lior Wolf
Intentional voice modifications by electronic or non-electronic means challenge automatic speaker recognition systems. Previous work focused on detecting the act of disguise or identifying everyday speakers disguising their voices. Here, we propose a benchmark for the study of voice disguise by studying the voice variability of professional voice actors. A dataset of 114 actors playing 647 characters is created. It contains 19 hours of captured speech, divided into 29,733 utterances tagged by character and actor names, which is then further sampled. Text-independent speaker identification of the actors, training on a subset of the characters they play and testing on new, unseen characters, yields an EER of 17.1%, an HTER of 15.9%, and a rank-1 recognition rate of 63.5% per utterance when a Convolutional Neural Network is trained on spectrograms generated from the utterances. An i-vector based system trained and tested on the same data yields an EER of 39.7%, an HTER of 39.4%, and a rank-1 recognition rate of 13.6%.
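For readers unfamiliar with the reported metric, the equal error rate (EER) is the operating point at which the false accept rate equals the false reject rate. A small helper sketch, computed from genuine and impostor score arrays via the ROC curve; the synthetic scores below stand in for the CNN or i-vector system outputs.

```python
# EER from genuine/impostor comparison scores (illustrative helper).
import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(genuine_scores, impostor_scores):
    labels = np.concatenate([np.ones_like(genuine_scores), np.zeros_like(impostor_scores)])
    scores = np.concatenate([genuine_scores, impostor_scores])
    far, tpr, _ = roc_curve(labels, scores)   # far = false accept rate
    frr = 1.0 - tpr                           # frr = false reject rate
    idx = np.nanargmin(np.abs(far - frr))     # point where FAR and FRR cross
    return (far[idx] + frr[idx]) / 2.0

rng = np.random.default_rng(0)
genuine = rng.normal(1.0, 1.0, 1000)          # same-actor comparison scores
impostor = rng.normal(-1.0, 1.0, 1000)        # different-actor comparison scores
print(f"EER = {equal_error_rate(genuine, impostor):.3f}")
```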
{"title":"I know that voice: Identifying the voice actor behind the voice","authors":"L. Uzan, Lior Wolf","doi":"10.1109/ICB.2015.7139074","DOIUrl":"https://doi.org/10.1109/ICB.2015.7139074","url":null,"abstract":"Intentional voice modifications by electronic or nonelectronic means challenge automatic speaker recognition systems. Previous work focused on detecting the act of disguise or identifying everyday speakers disguising their voices. Here, we propose a benchmark for the study of voice disguise, by studying the voice variability of professional voice actors. A dataset of 114 actors playing 647 characters is created. It contains 19 hours of captured speech, divided into 29,733 utterances tagged by character and actor names, which is then further sampled. Text-independent speaker identification of the actors based on a novel benchmark training on a subset of the characters they play, while testing on new unseen characters, shows an EER of 17.1%, HTER of 15.9%, and rank-1 recognition rate of 63.5% per utterance when training a Convolutional Neural Network on spectrograms generated from the utterances. An I-Vector based system was trained and tested on the same data, resulting in 39.7% EER, 39.4% HTER, and rank-1 recognition rate of 13.6%.","PeriodicalId":237372,"journal":{"name":"2015 International Conference on Biometrics (ICB)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129754565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Contactless biometric hand geometry recognition using a low-cost 3D camera
Pub Date: 2015-05-19 | DOI: 10.1109/ICB.2015.7139109
Jan Svoboda, M. Bronstein, M. Drahanský
In the past decade, interest in using 3D data for biometric person authentication has increased significantly, propelled by the availability of affordable 3D sensors. The adoption of 3D features has been especially successful in face recognition, leading to several commercial 3D face recognition products. In other biometric modalities such as hand recognition, several studies have shown the potential advantage of using 3D geometric information; however, no commercial-grade systems are currently available. In this paper, we present a contactless 3D hand recognition system based on the novel Intel RealSense camera, the first mass-produced embeddable 3D sensor. The small form factor and low cost make this sensor especially appealing for commercial biometric applications; however, these advantages come at the price of lower resolution compared to the more expensive 3D scanners used in previous research. We analyze the robustness of several existing 2D and 3D features that can be extracted from the images captured by the RealSense camera and study the use of metric learning for their fusion.
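One way to picture the fusion step is sketched below. The paper does not specify this particular method; Neighborhood Components Analysis is used here purely as a stand-in metric-learning technique, and the feature vectors are synthetic: concatenated 2D and 3D hand-geometry features are projected into a learned space where nearest-neighbor identification is performed.

```python
# Hedged sketch: concatenate 2D + 3D features, learn a linear metric, classify by 1-NN.
import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_subjects, samples_per_subject = 20, 6
labels = np.repeat(np.arange(n_subjects), samples_per_subject)
centers_2d = rng.normal(0, 1, (n_subjects, 12))   # stand-in 2D silhouette features
centers_3d = rng.normal(0, 1, (n_subjects, 8))    # stand-in 3D depth features
features = np.hstack([centers_2d[labels] + rng.normal(0, 0.3, (labels.size, 12)),
                      centers_3d[labels] + rng.normal(0, 0.3, (labels.size, 8))])

pipeline = make_pipeline(NeighborhoodComponentsAnalysis(n_components=10, random_state=0),
                         KNeighborsClassifier(n_neighbors=1))
print(cross_val_score(pipeline, features, labels, cv=3).mean())  # toy identification accuracy
```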
{"title":"Contactless biometric hand geometry recognition using a low-cost 3D camera","authors":"Jan Svoboda, M. Bronstein, M. Drahanský","doi":"10.1109/ICB.2015.7139109","DOIUrl":"https://doi.org/10.1109/ICB.2015.7139109","url":null,"abstract":"In the past decade, the interest in using 3D data for biometric person authentication has increased significantly, propelled by the availability of affordable 3D sensors. The adoption of 3D features has been especially successful in face recognition applications, leading to several commercial 3D face recognition products. In other biometric modalities such as hand recognition, several studies have shown the potential advantage of using 3D geometric information, however, no commercial-grade systems are currently available. In this paper, we present a contactless 3D hand recognition system based on the novel Intel RealSense camera, the first mass-produced embeddable 3D sensor. The small form factor and low cost make this sensor especially appealing for commercial biometric applications, however, they come at the price of lower resolution compared to more expensive 3D scanners used in previous research. We analyze the robustness of several existing 2D and 3D features that can be extracted from the images captured by the RealSense camera and study the use of metric learning for their fusion.","PeriodicalId":237372,"journal":{"name":"2015 International Conference on Biometrics (ICB)","volume":"509 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122758073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A transfer learning approach to cross-database facial expression recognition
Ronghang Zhu, Tingting Zhang, Qijun Zhao, Zhihong Wu
Pub Date: 2015-05-19 | DOI: 10.1109/ICB.2015.7139098
Despite the high facial expression recognition accuracy reported on individual databases, cross-database facial expression recognition is still a challenging problem. It is essentially the problem of generalizing a facial expression recognizer trained with data of certain subjects under certain conditions to different subjects and/or different conditions. Such generalization capability is crucial in real-world applications, yet little attention has been paid to this problem in the literature. Transfer learning, a domain adaptation approach, provides effective techniques for transferring knowledge from source (training) data to target (testing) data when they are characterized by different properties. This paper makes the first attempt to apply transfer learning to cross-database facial expression recognition. It proposes a transfer learning based cross-database facial expression recognition approach involving two training stages: one for learning knowledge from source data, and the other for adapting the learned knowledge to target data. The approach has been implemented using Gabor features extracted from facial images, regression tree classifiers, the AdaBoost algorithm, and support vector machines. Evaluation experiments have been conducted on the JAFFE, FEED, and extended Cohn-Kanade databases. The results demonstrate that, with the proposed transfer learning approach, cross-database facial expression recognition accuracy can be improved by more than 20%.
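A hedged sketch of the feature-extraction front end mentioned above (Gabor features followed by an SVM); the two-stage source/target adaptation, the regression trees, and the AdaBoost stage are omitted, and the filter parameters and toy images are assumptions.

```python
# Hedged sketch: Gabor filter-bank features from face images, classified with an SVM.
import numpy as np
from skimage.filters import gabor
from sklearn.svm import SVC

def gabor_features(image, frequencies=(0.1, 0.2, 0.3), orientations=4):
    """Mean and variance of Gabor magnitude responses over a small filter bank."""
    feats = []
    for f in frequencies:
        for k in range(orientations):
            real, imag = gabor(image, frequency=f, theta=k * np.pi / orientations)
            magnitude = np.hypot(real, imag)
            feats.extend([magnitude.mean(), magnitude.var()])
    return np.array(feats)

# stand-in "facial images" and expression labels from a source database
rng = np.random.default_rng(0)
images = rng.random((40, 32, 32))
labels = rng.integers(0, 7, size=40)             # seven expression classes

X = np.array([gabor_features(img) for img in images])
clf = SVC(kernel="rbf", gamma="scale").fit(X, labels)
print(clf.score(X, labels))                      # training accuracy on the toy data
```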
{"title":"A transfer learning approach to cross-database facial expression recognition","authors":"Ronghang Zhu, Tingting Zhang, Qijun Zhao, Zhihong Wu","doi":"10.1109/ICB.2015.7139098","DOIUrl":"https://doi.org/10.1109/ICB.2015.7139098","url":null,"abstract":"Despite the high facial expression recognition accuracy reported on individual databases, cross-database facial expression recognition is still a challenging problem. This is essentially a problem of generalizing a facial expression recognizer trained with data of certain subjects under certain conditions to different subjects and/or different conditions. Such generalization capability is crucial in real-world applications. However, little attention has been focused on this problem in the literature. Transfer learning, a domain adaptation approach, provides effective techniques for transferring knowledge from source (training) data to target (testing) data when they are characterized by different properties. This paper makes the first attempt to apply transferring learning to cross-database facial expression recognition. It proposes a transfer learning based cross-database facial expression recognition approach, in which two training stages are involved: One for learning knowledge from source data, and the other for adapting the learned knowledge to target data. This approach has been implemented based on Gabor features extracted from facial images, regression tree classifiers, the AdaBoosting algorithm, and support vector machines. Evaluation experiments have been done on the JAFFE, FEED, and extended Cohn-Kanade databases. The results demonstrate that using the proposed transferring learning approach the cross-database facial expression recognition accuracy can be improved by more than 20%.","PeriodicalId":237372,"journal":{"name":"2015 International Conference on Biometrics (ICB)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123212384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}