LBO-Shape Densities: Efficient 3D Shape Retrieval Using Wavelet Density Estimation
Mark Moyou, Koffi Eddy Ihou, A. Peter. ICPR 2014, doi:10.1109/ICPR.2014.19

Driven by desirable attributes such as topological characterization and invariance to isometric transformations, the Laplace-Beltrami operator (LBO) and its associated spectrum have been widely adopted by the shape analysis community. Here we demonstrate a novel use of the LBO for shape matching and retrieval by estimating probability densities on its eigenspace, and subsequently using the intrinsic geometry of the density manifold to categorize similar shapes. In our framework, each 3D shape's rich geometric structure, as captured by the low-order eigenvectors of its LBO, is robustly characterized via a nonparametric density estimated directly on these eigenvectors. By utilizing a probabilistic model in which the square root of the density is expanded in a wavelet basis, the space of LBO-shape densities is identifiable with the unit hypersphere. We leverage this simple geometry for retrieval by computing an intrinsic Karcher mean (on the hypersphere of LBO-shape densities) for each shape category, and use the closed-form distance between a query shape and the means to classify shapes. Our method alleviates the need for the superfluous feature extraction schemes required by popular bag-of-features approaches, and experiments demonstrate it to be robust and competitive with state-of-the-art 3D shape retrieval algorithms.
{"title":"LBO-Shape Densities: Efficient 3D Shape Retrieval Using Wavelet Density Estimation","authors":"Mark Moyou, Koffi Eddy Ihou, A. Peter","doi":"10.1109/ICPR.2014.19","DOIUrl":"https://doi.org/10.1109/ICPR.2014.19","url":null,"abstract":"Driven by desirable attributes such as topological characterization and invariance to isometric transformations, the use of the Laplace-Beltrami operator (LBO) and its associated spectrum have been widely adopted among the shape analysis community. Here we demonstrate a novel use of the LBO for shape matching and retrieval by estimating probability densities on its Eigen space, and subsequently using the intrinsic geometry of the density manifold to categorize similar shapes. In our framework, each 3D shape's rich geometric structure, as captured by the low order eigenvectors of its LBO, is robustly characterized via a nonparametric density estimated directly on these eigenvectors. By utilizing a probabilistic model where the square root of the density is expanded in a wavelet basis, the space of LBO-shape densities is identifiable with the unit hyper sphere. We leverage this simple geometry for retrieval by computing an intrinsic Karcher mean (on the hyper sphere of LBO-shape densities) for each shape category, and use the closed-form distance between a query shape and the means to classify shapes. Our method alleviates the need for superfluous feature extraction schemes-required for popular bag-of-features approaches-and experiments demonstrate it to be robust and competitive with the state-of-the-art in 3D shape retrieval algorithms.","PeriodicalId":142159,"journal":{"name":"2014 22nd International Conference on Pattern Recognition","volume":"183 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122769169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data Sufficiency for Online Writer Identification: A Comparative Study of Writer-Style Space vs. Feature Space Models
Arti Shivram, Chetan Ramaiah, V. Govindaraju. ICPR 2014, doi:10.1109/ICPR.2014.538

A key factor in building effective writer identification/verification systems is the amount of data required to build the underlying models. In this research we systematically examine data sufficiency bounds for two broad approaches to online writer identification: feature space models vs. writer-style space models. We report results from 40 experiments conducted on two publicly available datasets, and also test identification performance for the target models using two different feature functions. Our findings show that the writer-style space model gives higher identification performance for a given amount of data and, further, achieves high performance levels at lower data costs. This model appears to require as few as 20 words per page to achieve identification performance close to 80%, and reaches more than 90% accuracy with higher levels of data enrollment.
{"title":"Data Sufficiency for Online Writer Identification: A Comparative Study of Writer-Style Space vs. Feature Space Models","authors":"Arti Shivram, Chetan Ramaiah, V. Govindaraju","doi":"10.1109/ICPR.2014.538","DOIUrl":"https://doi.org/10.1109/ICPR.2014.538","url":null,"abstract":"A key factor in building effective writer identification/verification systems is the amount of data required to build the underlying models. In this research we systematically examine data sufficiency bounds for two broad approaches to online writer identification -- feature space models vs. writer-style space models. We report results from 40 experiments conducted on two publicly available datasets and also test identification performance for the target models using two different feature functions. Our findings show that the writer-style space model gives higher identification performance for a given level of data and further, achieves high performance levels with lesser data costs. This model appears to require as less as 20 words per page to achieve identification performance close to 80% and reaches more than 90% accuracy with higher levels of data enrollment.","PeriodicalId":142159,"journal":{"name":"2014 22nd International Conference on Pattern Recognition","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122602367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Person Re-identification via Discriminative Accumulation of Local Features
Tetsu Matsukawa, Takahiro Okabe, Yoichi Sato. ICPR 2014, doi:10.1109/ICPR.2014.681

Metric learning, which learns a distance metric that distinguishes different people while being insensitive to intra-person variations, is widely applied to person re-identification. In previous works, local histograms are densely sampled to extract spatially localized information from each person image. The extracted local histograms are then concatenated into one vector that is used as the input to metric learning. However, the dimensionality of such a concatenated vector often becomes large while the number of training samples is limited, which leads to an overfitting problem. In this work, we argue that this overfitting arises because each local histogram dimension (e.g., a color brightness bin) at the same spatial position is treated separately when examining which parts of the image are more discriminative. To solve this problem, we propose a method that analyzes discriminative image positions shared by different local histogram dimensions. A common weight map shared by different dimensions and a distance metric that emphasizes discriminative dimensions in the local histogram are jointly learned with a unified discriminative criterion. Our experiments using four different public datasets confirm the effectiveness of the proposed method.
{"title":"Person Re-identification via Discriminative Accumulation of Local Features","authors":"Tetsu Matsukawa, Takahiro Okabe, Yoichi Sato","doi":"10.1109/ICPR.2014.681","DOIUrl":"https://doi.org/10.1109/ICPR.2014.681","url":null,"abstract":"Metric learning to learn a good distance metric for distinguishing different people while being insensitive to intra-person variations is widely applied to person re-identification. In previous works, local histograms are densely sampled to extract spatially localized information of each person image. The extracted local histograms are then concatenated into one vector that is used as an input of metric learning. However, the dimensionality of such a concatenated vector often becomes large while the number of training samples is limited. This leads to an over fitting problem. In this work, we argue that such a problem of over-fitting comes from that it is each local histogram dimension (e.g. color brightness bin) in the same position is treated separately to examine which part of the image is more discriminative. To solve this problem, we propose a method that analyzes discriminative image positions shared by different local histogram dimensions. A common weight map shared by different dimensions and a distance metric which emphasizes discriminative dimensions in the local histogram are jointly learned with a unified discriminative criterion. Our experiments using four different public datasets confirmed the effectiveness of the proposed method.","PeriodicalId":142159,"journal":{"name":"2014 22nd International Conference on Pattern Recognition","volume":"363 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122830724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Prosodic, Spectral and Voice Quality Feature Selection Using a Long-Term Stopping Criterion for Audio-Based Emotion Recognition
Markus Kächele, D. Zharkov, S. Meudt, F. Schwenker. ICPR 2014, doi:10.1109/ICPR.2014.148
Emotion recognition from speech is an important field of research in human-machine interfaces, and has begun to influence everyday life through deployment in areas such as call centers or wearable companions in the form of smartphones. In the proposed classification architecture, different spectral, prosodic, and relatively novel voice quality features are extracted from the speech signals. These features are then used to represent long-term information of the speech, leading to utterance-wise suprasegmental features. The most promising of these features are selected using a forward-selection/backward-elimination algorithm with a novel long-term termination criterion. The overall system has been evaluated on recordings from the public Berlin emotion database. Utilizing the resulting features, a recognition rate of 88.97% has been achieved, which surpasses the performance of humans on this database and is comparable to the state-of-the-art performance on this dataset.
{"title":"Prosodic, Spectral and Voice Quality Feature Selection Using a Long-Term Stopping Criterion for Audio-Based Emotion Recognition","authors":"Markus Kächele, D. Zharkov, S. Meudt, F. Schwenker","doi":"10.1109/ICPR.2014.148","DOIUrl":"https://doi.org/10.1109/ICPR.2014.148","url":null,"abstract":"Emotion recognition from speech is an important field of research in human-machine-interfaces, and has begun to influence everyday life by employment in different areas such as call centers or wearable companions in the form of smartphones. In the proposed classification architecture, different spectral, prosodic and the relatively novel voice quality features are extracted from the speech signals. These features are then used to represent long-term information of the speech, leading to utterance-wise suprasegmental features. The most promising of these features are selected using a forward-selection/backward-elimination algorithm with a novel long-term termination criterion for the selection. The overall system has been evaluated using recordings from the public Berlin emotion database. Utilizing the resulted features, a recognition rate of 88,97% has been achieved which surpasses the performance of humans on this database and is comparable to the state of the art performance on this dataset.","PeriodicalId":142159,"journal":{"name":"2014 22nd International Conference on Pattern Recognition","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114580524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatially-Variant Area Openings for Reference-Driven Adaptive Contour Preserving Filtering
G. Franchi, J. Angulo. ICPR 2014, doi:10.1109/ICPR.2014.189

Classical adaptive mathematical morphology is based on operators that locally adapt the structuring elements to the image properties. Connected morphological operators act on the level of the flat zones of an image, such that only flat zones are filtered out and hence object edges are preserved. Area opening (resp. area closing) is one of the most useful connected operators; it filters out bright (resp. dark) regions and intrinsically involves adapting the shape of the structuring element, parameterized by its area. In this paper, we introduce the notion of reference-driven adaptive area opening according to two spatially-variant paradigms. First, the area parameter is locally adapted by a reference image; this approach is applied to processing intensity images, where a depth image is used to adapt the scale of the processing. Second, we introduce a self-dual area opening, in which the reference image determines whether the area filter acts as an opening or a closing, according to the relationship between the image and the reference; its natural application domain is video sequences.
{"title":"Spatially-Variant Area Openings for Reference-Driven Adaptive Contour Preserving Filtering","authors":"G. Franchi, J. Angulo","doi":"10.1109/ICPR.2014.189","DOIUrl":"https://doi.org/10.1109/ICPR.2014.189","url":null,"abstract":"Classical adaptive mathematical morphology is based on operators which locally adapt the structuring elements to the image properties. Connected morphological operators act on the level of the flat zones of an image, such that only flat zones are filtered out, and hence the object edges are preserved. Area opening (resp. area closing) is one of the most useful connected operators, which filters out the bright (resp. dark) regions. It intrinsically involves the adaptation of the shape of the structuring element parameterized by its area. In this paper, we introduce the notion of reference-driven adaptive area opening according to two spatially-variant paradigms. First, the parameter of area is locally adapted by the reference image. This approach is applied to processing intensity depth images where the depth image is used to adapt the scale-size processing. Second, a self-dual area opening, where the reference image determines if the area filter is an opening or a closing with respect to the relationship between the image and the reference. Its natural application domain are the video sequences.","PeriodicalId":142159,"journal":{"name":"2014 22nd International Conference on Pattern Recognition","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122087143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Video Text Extraction Using the Fusion of Color Gradient and Log-Gabor Filter
Zhike Zhang, Weiqiang Wang, K. Lu. ICPR 2014, doi:10.1109/ICPR.2014.506

Video text, which contains rich semantic information, can be utilized for video indexing and summarization. However, compared with scanned documents, text recognition for video text is still a challenging problem due to complex backgrounds. Segmenting a text line into single characters before text extraction can achieve higher recognition accuracy, since the background of a single character is less complex than that of a whole text line. Therefore, we first perform character segmentation, which accurately locates the character gaps in the text line. More specifically, we compute a fusion map that fuses the results of the color gradient and a Log-Gabor filter. Then, candidate segmentation points are obtained by vertical projection analysis of the fusion map, and the final segmentation points are chosen as the candidates with the minimum projection value within a limited range. Finally, we obtain the binary image of each single-character image by applying K-means clustering, and combine the results to form the binary image of the whole text line. The binary image is further refined by inward filling and the fusion map. Experimental results on a large amount of data show that the proposed method yields better binarization, which leads to a higher character recognition rate for the OCR engine.
{"title":"Video Text Extraction Using the Fusion of Color Gradient and Log-Gabor Filter","authors":"Zhike Zhang, Weiqiang Wang, K. Lu","doi":"10.1109/ICPR.2014.506","DOIUrl":"https://doi.org/10.1109/ICPR.2014.506","url":null,"abstract":"Video text which contains rich semantic information can be utilized for video indexing and summarization. However, compared with scanned documents, text recogniton for video text is still a challenging problem due to complex background. Segmenting text line into single characters before text extraction can achieve higher recognition accuracy, since background of single character is less complex compared with whole text line. Therefore, we first perform character segmentation, which can accurately locate the character gap in the text line. More specifically, we get a fusion map which fuses the results of color gradient and log-gabor filter. Then, candidate segmentation points are obtained by vertical projection analysis of the fusion map. We get segmentation points by finding minimum projection value of candidate points in a limited range. Finally, we get the binary image of the single character image by applying K-means clustering and combine their results to form binary image of the whole text line. The binary image is further refined by inward filling and the fusion map. The experimental results on a large amount of data show that the proposed method can contribute to better binarization result which leads to a higher character recognition rate of OCR engine.","PeriodicalId":142159,"journal":{"name":"2014 22nd International Conference on Pattern Recognition","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122132963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-shot Person Re-identification with Automatic Ambiguity Inference and Removal
Chunchao Guo, Shi-Zhe Chen, J. Lai, Xiao-Jun Hu, Shi-Chang Shi. ICPR 2014, doi:10.1109/ICPR.2014.609
This work tackles the challenging problem of multi-shot person re-identification in realistic unconstrained scenarios. While most previous research in the re-identification field is based on the single-shot mode, owing to the limited scale of conventional datasets, the multi-shot case provides a more natural way to recognize people in surveillance systems: multiple frames can be easily captured in a camera network, so more complementary information can be extracted for a more robust signature. To re-identify targets in the real world, a key and commonly occurring issue, identity ambiguity, must first be resolved; this issue is not considered by most previous studies. During the offline stage, we train an ambiguity classifier based on shape context features extracted from foreground responses in videos. Given a probe pedestrian, this paper employs the offline-trained classifier to recognize and remove ambiguous samples, and then utilizes an improved hierarchical appearance representation to match people across multiple shots. Evaluations of this approach are conducted on two challenging real-world datasets, both newly released in this paper, and yield impressive performance.
{"title":"Multi-shot Person Re-identification with Automatic Ambiguity Inference and Removal","authors":"Chunchao Guo, Shi-Zhe Chen, J. Lai, Xiao-Jun Hu, Shi-Chang Shi","doi":"10.1109/ICPR.2014.609","DOIUrl":"https://doi.org/10.1109/ICPR.2014.609","url":null,"abstract":"This work tackles the challenging problem of multi-shot person re-identification in realistic unconstrained scenarios. While most previous research within re-identification field is based on single-shot mode due to the constraint of scales of conventional datasets, multi-shot case provides a more natural way for person recognition in surveillance systems. Multiple frames can be easily captured in a camera network, thus more complementary information can be extracted for a more robust signature. To re-identify targets in real world, a key issue named identity ambiguity that commonly occurs must be solved preferentially, which is not considered by most previous studies. During the offline stage, we train an ambiguity classifier based on the shape context extracted from foreground responses in videos. Given a probe pedestrian, this paper employs the offline trained classifier to recognize and remove ambiguous samples, and then utilizes an improved hierarchical appearance representation to match humans between multiple-shots. Evaluations of this approach are conducted on two challenging real-world datasets, both of which are newly released in this paper, and yield impressive performance.","PeriodicalId":142159,"journal":{"name":"2014 22nd International Conference on Pattern Recognition","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122157355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning Semantic Binary Codes by Encoding Attributes for Image Retrieval
Jianwei Luo, Zhi-guo Jiang. ICPR 2014, doi:10.1109/ICPR.2014.57

This paper addresses the problem of learning semantic compact binary codes for efficient retrieval in large-scale image collections. Our contributions are three-fold. First, we introduce semantic codes, in which each bit corresponds to an attribute describing a property of an object (e.g., dogs are furry). Second, we propose to use matrix factorization (MF) to learn the semantic codes by encoding attributes. Unlike traditional PCA-based encoding methods, which quantize data onto orthogonal bases, MF places no constraints on the bases; this is consistent with the fact that attributes are correlated. Finally, to augment the semantic codes, MF is extended to encode extra non-semantic codes that preserve similarity in the original data space. Evaluations on the a-Pascal dataset show that our method is comparable to the state of the art when using Euclidean distance as ground truth, and even outperforms the state of the art when using class labels as ground truth. Furthermore, in experiments, our method can retrieve images that share the same semantic properties as the query image, which can be useful for other vision tasks, e.g., re-training classifiers.
{"title":"Learning Semantic Binary Codes by Encoding Attributes for Image Retrieval","authors":"Jianwei Luo, Zhi-guo Jiang","doi":"10.1109/ICPR.2014.57","DOIUrl":"https://doi.org/10.1109/ICPR.2014.57","url":null,"abstract":"This paper addresses the problem of learning semantic compact binary codes for efficient retrieval in large-scale image collections. Our contributions are three-fold. Firstly, we introduce semantic codes, of which each bit corresponds to an attribute that describes a property of an object (e.g. dogs have furry). Secondly, we propose to use matrix factorization (MF) to learn the semantic codes by encoding attributes. Unlike traditional PCA-based encoding methods which quantize data into orthogonal bases, MF assumes no constraints on bases, and this scheme is coincided with that attributes are correlated. Finally, to augment semantic codes, MF is extended to encode extra non-semantic codes to preserve similarity in origin data space. Evaluations on a-Pascal dataset show that our method is comparable to the state-of-the-art when using Euclidean distance as ground truth, and even outperforms state-of-the-art when using class label as ground truth. Furthermore, in experiments, our method can retrieve images that share the same semantic properties with the query image, which can be used to other vision tasks, e.g. re-training classifiers.","PeriodicalId":142159,"journal":{"name":"2014 22nd International Conference on Pattern Recognition","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122220577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gait Recognition Using Flow Histogram Energy Image
Yazhou Yang, D. Tu, Guohui Li. ICPR 2014, doi:10.1109/ICPR.2014.85

Human gait is of essential importance owing to its wide use in biometric person identification applications. In this work, we introduce a novel spatio-temporal gait representation, the Flow Histogram Energy Image (FHEI), to characterize the distinctive motion information of an individual's gait. We first extract Histograms of Optical Flow (HOF) descriptors from each silhouette image of a gait sequence, and construct an FHEI by averaging all the HOF features over a full gait cycle. We also propose a novel approach to generate two different synthetic gait templates; real and synthetic gait templates are then fused to enhance the recognition accuracy of FHEI. We further adopt Non-negative Matrix Factorization (NMF) to learn a part-based representation of the FHEI templates. Extensive experiments conducted on the USF HumanID gait database indicate that the proposed FHEI approach achieves superior or comparable performance in comparison with a number of competitive gait recognition algorithms.
{"title":"Gait Recognition Using Flow Histogram Energy Image","authors":"Yazhou Yang, D. Tu, Guohui Li","doi":"10.1109/ICPR.2014.85","DOIUrl":"https://doi.org/10.1109/ICPR.2014.85","url":null,"abstract":"Human gait is of essential importance for its wide use in biometric person-identification applications. In this work, we introduce a novel spatio-temporal gait representation, Flow Histogram Energy Image (FHEI), to characterize distinctive motion information of individual gait. We first extract the Histograms of Optical Flow (HOF) descriptors of each silhouette image of gait sequence, and construct an FHEI by averaging all the HOF features of a full gait cycle. We also propose a novel approach to generate two different synthetic gait templates. Real and synthetic gait templates are then fused to enhance the recognition accuracy of FHEI. We also adopt the Non-negative Matrix Factorization (NMF) to learn a part-based representation of FHEI templates. Extensive experiments conducted on the USF HumanID gait database indicate that the proposed FHEI approach achieves superior or comparable performance in comparison with a number of competitive gait recognition algorithms.","PeriodicalId":142159,"journal":{"name":"2014 22nd International Conference on Pattern Recognition","volume":"160 Pt 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128740749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Low Dimensionality Expression Robust Rejector for 3D Face Recognition
Jiangning Gao, Mehryar Emambakhsh, A. Evans. ICPR 2014, doi:10.1109/ICPR.2014.96

In the past decade, expression variations have been one of the most challenging sources of variability in 3D face recognition, especially in scenarios with a large number of face samples to discriminate between. In this paper, an expression-robust rejector is proposed that first robustly locates landmarks on the relatively stable structure of the nose and its environs, termed the cheek/nose region. Then, by defining curves connecting the landmarks, a small set of features (4 curves with only 15 points each) on the cheek/nose surface is selected using the Bosphorus database. The resulting rejector, which can quickly eliminate a large number of candidates at an early stage, is further evaluated on the FRGC database for both the identification and verification scenarios. The classification performance using only 60 points from 4 curves shows the effectiveness of this efficient expression-robust rejector.
{"title":"A Low Dimensionality Expression Robust Rejector for 3D Face Recognition","authors":"Jiangning Gao, Mehryar Emambakhsh, A. Evans","doi":"10.1109/ICPR.2014.96","DOIUrl":"https://doi.org/10.1109/ICPR.2014.96","url":null,"abstract":"In the past decade, expression variations have been one of the most challenging sources of variability in 3D face recognition, especially for scenarios where there are a large number of face samples to discriminate between. In this paper, an expression robust reject or is proposed that first robustly locates landmarks on the relatively stable structure of the nose and its environs, termed the cheek/nose region. Then, by defining curves connecting the landmarks, a small set of features (4 curves with only 15 points each) on the cheek/nose surface are selected using the Bosphorus database. The resulting reject or, which can quickly eliminate a large number of candidates at an early stage, is further evaluated on the FRGC database for both the identification and verification scenarios. The classification performance using only 60 points from 4 curves shows the effectiveness of this efficient expression robust rejector.","PeriodicalId":142159,"journal":{"name":"2014 22nd International Conference on Pattern Recognition","volume":"145 10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129633005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}