One of the key challenges for face recognition is finding efficient and discriminative facial appearance descriptors that are resistant to large variations in illumination, pose, facial expression, ageing, face misalignment and other changes. In this paper, we propose a novel facial appearance descriptor based on the local binary pattern (LBP), which offers several advantages: (1) it is more discriminative; (2) it is insensitive to variations in illumination, pose, facial expression, ageing and face misalignment; (3) it can be computed very efficiently, and the resulting feature sets are low-dimensional. Experiments on the FERET database show that the proposed operator significantly outperforms other feature descriptors.
{"title":"A Novel Facial Appearance Descriptor Based on Local Binary Pattern","authors":"Shihu Zhu, Jufu Feng","doi":"10.1109/CCPR.2008.58","DOIUrl":"https://doi.org/10.1109/CCPR.2008.58","journal":{"name":"2008 Chinese Conference on Pattern Recognition"},"publicationDate":"2008-10-31"}
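As a point of reference for the LBP-based descriptor summarized above, the basic 3x3 LBP operator can be sketched as follows. This is the standard textbook operator, not the novel descriptor proposed in the paper, whose details the abstract does not give. The function names `lbp_code` and `lbp_histogram`, the list-of-lists image layout and the clockwise neighbour ordering are assumptions made for illustration.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code for the interior pixel at (r, c).

    img is a list of rows of grey levels (an assumed layout).
    """
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


img = [[1, 2, 3],
       [8, 5, 4],
       [7, 6, 5]]
brighter = [[v + 50 for v in row] for row in img]
# The code depends only on the sign of each neighbour-centre difference,
# so a uniform brightness shift leaves it unchanged.
print(lbp_code(img, 1, 1), lbp_code(brighter, 1, 1))
```

Because only the ordering of grey levels matters, any monotonic change in brightness leaves the codes unchanged, which is one reason LBP-style descriptors are commonly regarded as robust to illumination.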
Resisting geometric transformation attacks effectively has become a focus of digital watermarking research. In this paper, a blind video watermarking algorithm resistant to geometric transformation attacks is proposed. In the watermark embedding scheme, the video is first segmented into shots. Then, the background of a video segment in the shot is extracted by independent component analysis (ICA). Finally, the background image is decomposed by the nonsubsampled Contourlet transform (NSCT), and a meaningful watermark is embedded into the lowpass subband. In the watermark extraction scheme, the watermarked video segment in the shot is first analyzed by ICA to extract the background image carrying the watermark information. Then, the watermarked video segment and the other frames in the shot are analyzed by ICA to extract the background image without watermark information. Finally, the two background images are decomposed by NSCT, and the watermark is extracted by detecting the difference between the lowpass subbands of the two background images. Experimental results show that this algorithm makes the watermark resist geometric transformation attacks effectively while preserving the visual quality of the video. In addition, it is robust enough to resist other attacks.
{"title":"A Robust Video Watermarking Algorithm Resistant to Geometry Transformation Attacks Based on Background","authors":"L. Pang, Yiquan Wu","doi":"10.1109/CCPR.2008.70","DOIUrl":"https://doi.org/10.1109/CCPR.2008.70","journal":{"name":"2008 Chinese Conference on Pattern Recognition"},"publicationDate":"2008-10-31"}
In this paper, we propose a two-dimensional Inverse Fisher Discriminant Analysis (2DIFDA) method for feature extraction and face recognition. The method combines the ideas of two-dimensional principal component analysis (2DPCA) and Inverse FDA: based on the inverse Fisher discriminant criterion, it extracts the optimal projection vectors directly from 2D image matrices rather than from image vectors. Experiments on the FERET face database show that the new method outperforms PCA, 2DPCA, Fisherfaces and inverse Fisher discriminant analysis.
{"title":"Two-Dimensional Inverse FDA for Face Recognition","authors":"Wankou Yang, Hui Yan, Jun Yin, Jingyu Yang","doi":"10.1109/CCPR.2008.51","DOIUrl":"https://doi.org/10.1109/CCPR.2008.51","journal":{"name":"2008 Chinese Conference on Pattern Recognition"},"publicationDate":"2008-10-31"}
This paper presents a mispronunciation detection system that uses automatic speech recognition to detect phone-level mispronunciations made by Cantonese learners of English. Our approach extends a target pronunciation lexicon with phonetic confusions that may lead to pronunciation errors, producing an extended pronunciation lexicon that contains both the target pronunciation of each word and its pronunciation variants. Viterbi decoding is then run with the extended pronunciation lexicon to detect phone-level mispronunciations in learners' speech. This paper introduces a data-driven approach that performs automatic phone recognition on the Cantonese learners' speech and analyzes the recognition errors to derive the possible phonetic confusions. Because the rule-based generation process produces many implausible mispronunciations, we present a method to prune the extended pronunciation lexicon automatically. Experimental results show that the pruned extended pronunciation lexicon detects phone-level mispronunciations better than the fully extended pronunciation lexicon.
{"title":"Phone-Level Mispronunciation Detection for Computer-Assisted Language Learning","authors":"Xin Feng, Lan Wang","doi":"10.1109/CCPR.2008.83","DOIUrl":"https://doi.org/10.1109/CCPR.2008.83","journal":{"name":"2008 Chinese Conference on Pattern Recognition"},"publicationDate":"2008-10-31"}
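The lexicon extension step described in the entry above can be illustrated with a minimal sketch. It assumes that a pronunciation is a list of phone symbols and that the phonetic confusions are given as a mapping from a target phone to the phones learners may substitute for it. The function name `extend_pronunciations`, the phone symbols and the two confusion rules are hypothetical; the paper's actual rule derivation and pruning method are not given in the abstract and are not reproduced here.

```python
from itertools import product


def extend_pronunciations(target, confusions):
    """Return the target pronunciation followed by every variant obtained
    by substituting confusable phones (all combinations)."""
    # For each position, the target phone comes first, then its substitutes.
    options = [[phone] + sorted(confusions.get(phone, ())) for phone in target]
    return [list(variant) for variant in product(*options)]


# Hypothetical confusion rules, for illustration only.
confusions = {"n": {"l"}, "t": {"d"}}
for variant in extend_pronunciations(["n", "ay", "t"], confusions):
    print(" ".join(variant))
```

The number of variants grows multiplicatively with the number of confusable phones in a word, which suggests why a fully extended lexicon contains many implausible entries and why pruning is needed.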
As the first step in a face normalization procedure, accurate eye localization is of fundamental importance to the performance of face recognition systems. One of the most classical methods for this task is the pictorial model, in which the appearance model and the shape constraints are optimized together. However, under extreme illumination changes and large expression variations, the simple Gaussian appearance model and the localization-based shape constraints used in the pictorial model cannot handle the complex appearance and structural changes that appear in the given face image. In this paper, we enhance the pictorial model by combining the strengths of illumination preprocessing, robust image descriptors, a probabilistic SVM and an improved structural model that is invariant to scale, rotation and other transforms. Experimental results on the CAS-PEAL dataset demonstrate that the proposed model can accurately localize eyes in spite of large illumination and expression variations in face images.
{"title":"Accurate Eye Localization under Large Illumination and Expression Variations with Enhanced Pictorial Model","authors":"F. Song, Xiaoyang Tan, Songcan Chen","doi":"10.1109/CCPR.2008.25","DOIUrl":"https://doi.org/10.1109/CCPR.2008.25","journal":{"name":"2008 Chinese Conference on Pattern Recognition"},"publicationDate":"2008-10-31"}
Nearest feature line (NFL) (S.Z. Li and J. Lu, 1999) is an efficient yet simple classification method for pattern recognition. This paper presents a theoretical analysis and interpretation of NFL from the perspective of manifold analysis, and explains the geometric nature of NFL-based similarity measures. It is shown that NFL, nearest feature plane (NFP) and nearest feature space (NFS) are special cases of tangent approximation. Under the manifold assumption, we introduce localized NFL (LNFL) and nearest feature spline (NFB) to further enhance classification ability and reduce computational complexity. LNFL extends NFL's Euclidean distance to a manifold distance. For NFB, feature lines are constructed along a manifold's variation, which is defined on a tangent bundle. The proposed methods are validated on a synthetic dataset and two standard face recognition databases (FRGC version 2 and FERET). Experimental results illustrate their efficiency and effectiveness.
{"title":"Nearest Feature Line: A Tangent Approximation","authors":"R. He, Meng Ao, Shi-ming Xiang, S.Z. Li","doi":"10.1109/CCPR.2008.22","DOIUrl":"https://doi.org/10.1109/CCPR.2008.22","journal":{"name":"2008 Chinese Conference on Pattern Recognition"},"publicationDate":"2008-10-31"}
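For readers unfamiliar with the baseline that the entry above analyzes, the original NFL rule can be sketched as follows: a query point is projected onto the line through each pair of prototypes of the same class, and the class owning the closest line wins. The sketch covers only plain NFL, not the LNFL or NFB extensions proposed in the paper. The function names and the dictionary layout of the prototypes are assumptions made for illustration.

```python
import math
from itertools import combinations


def feature_line_distance(x, x1, x2):
    """Euclidean distance from x to the line through prototypes x1 and x2."""
    direction = [b - a for a, b in zip(x1, x2)]
    denom = sum(d * d for d in direction)
    if denom == 0:
        # Coincident prototypes: the line degenerates to a point.
        projection = list(x1)
    else:
        # Position parameter of the projection along the line;
        # values outside [0, 1] extrapolate beyond the two prototypes.
        mu = sum((xi - ai) * d for xi, ai, d in zip(x, x1, direction)) / denom
        projection = [ai + mu * d for ai, d in zip(x1, direction)]
    return math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, projection)))


def nfl_classify(x, prototypes):
    """Label of the class whose nearest feature line is closest to x.

    prototypes maps a label to a list of points; classes with fewer
    than two prototypes contribute no feature line and are skipped.
    """
    best_label, best_dist = None, math.inf
    for label, points in prototypes.items():
        for x1, x2 in combinations(points, 2):
            dist = feature_line_distance(x, x1, x2)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label


prototypes = {"a": [(0, 0), (2, 0)], "b": [(0, 5), (2, 5)]}
print(nfl_classify((5, 1), prototypes))
```

The query point lies beyond both prototypes of class "a" yet is still matched through the extrapolated part of the line, the property that distinguishes NFL from nearest-neighbour matching.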
To overcome the excessive dependence on input parameters in intelligence single particle optimization (ISPO), an improved algorithm called simplified intelligence single particle optimization (SISPO) is proposed in this paper. SISPO maintains performance similar to that of ISPO while requiring no special parameter settings. The proposed SISPO was successfully applied to train a neural network classifier for digit recognition. Experimental results demonstrate that the proposed neural network training algorithm, the simplified intelligence single particle optimization neural network (SISPONN), achieves lower training and test errors than traditional backpropagation (BP) algorithms such as gradient methods.
{"title":"Simplified Intelligence Single Particle Optimization Based Neural Network for Digit Recognition","authors":"Jiarui Zhou, Z. Ji, L. Shen","doi":"10.1109/CCPR.2008.74","DOIUrl":"https://doi.org/10.1109/CCPR.2008.74","journal":{"name":"2008 Chinese Conference on Pattern Recognition"},"publicationDate":"2008-10-31"}
Many problems in information processing involve some form of dimensionality reduction. This paper develops a new dimensionality reduction approach for high-dimensional data, called local maximal marginal (interclass) embedding (LMME), for manifold learning and pattern recognition. LMME can be seen as a linear approach within a multimanifold-based learning framework that integrates neighbor and class relations. LMME characterizes the local maximal marginal scatter as well as the local intraclass compactness, seeking a projection that maximizes the local maximal margin and minimizes the local intraclass scatter. This characteristic makes LMME more powerful than the most up-to-date method, Marginal Fisher Analysis (MFA), while maintaining all the advantages of MFA. The proposed algorithm is applied to face recognition and is evaluated on the Yale, AR and ORL face image databases. The experimental results show that LMME consistently outperforms PCA, LDA and MFA, owing to its locally discriminating nature. This demonstrates that LMME is an effective method for face recognition.
{"title":"Local Maximal Marginal Embedding with Application to Face Recognition","authors":"Cairong Zhao, Zhihui Lai, Yuelei Sui, Yi Chen","doi":"10.1109/CCPR.2008.49","DOIUrl":"https://doi.org/10.1109/CCPR.2008.49","journal":{"name":"2008 Chinese Conference on Pattern Recognition"},"publicationDate":"2008-10-31"}
In Chinese text categorization systems, most classifiers use the vector space model (VSM), in which all attributes of the documents together form a high-dimensional feature space. The high dimensionality of this feature space is the bottleneck of categorization. TFIDF is a common method for weighting the terms in a document. The method is simple, but it does not consider the unbalanced distribution of terms among classes. This paper analyzes the TFIDF feature selection algorithm in depth and proposes a new TFIDF feature selection method based on Gini index theory. Experimental results show that the method improves the accuracy of text categorization.
{"title":"A Text Feature Selection Algorithm Based on Improved TFIDF","authors":"Cheng-San Yang, Xingshi He","doi":"10.1109/CCPR.2008.87","DOIUrl":"https://doi.org/10.1109/CCPR.2008.87","journal":{"name":"2008 Chinese Conference on Pattern Recognition"},"publicationDate":"2008-10-31"}
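The limitation that the entry above attributes to TFIDF can be illustrated with a small sketch that computes a standard TF-IDF weight next to a Gini-style purity score over classes. The paper's actual formula is not given in the abstract, so this is not the authors' method. The purity form used here (the sum of squared class proportions among the documents containing a term), the function names and the toy corpus are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(term, doc, docs):
    """Standard TF-IDF: relative term frequency times log(N / df)."""
    df = sum(1 for d in docs if term in d)
    if df == 0 or not doc:
        return 0.0
    return (doc.count(term) / len(doc)) * math.log(len(docs) / df)


def gini_purity(term, docs, labels):
    """Sum of squared class proportions among documents containing term.

    The score is 1.0 when the term occurs in one class only and falls
    toward 1 / (number of classes) as the term spreads evenly.
    """
    per_class = Counter(label for d, label in zip(docs, labels) if term in d)
    total = sum(per_class.values())
    if total == 0:
        return 0.0
    return sum((n / total) ** 2 for n in per_class.values())


# Toy corpus, hypothetical and already tokenized.
docs = [["ball", "goal"], ["ball", "team"], ["vote", "ball"], ["vote", "law"]]
labels = ["sport", "sport", "politics", "politics"]
for term in ("ball", "vote"):
    print(term, tf_idf(term, docs[2], docs), gini_purity(term, docs, labels))
```

TF-IDF sees only document counts, whereas the purity score also reflects how a term's documents are split between classes: "vote" occurs in a single class and scores 1.0, while "ball" is spread over both classes and scores lower.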
Unsupervised discriminant projection (UDP) performs well on face recognition problems, but it does not make full use of the class information of the training samples, which is useful for classification. Linear discriminant analysis (LDA) is a classical face recognition method. It is effective for classification, but it cannot discover the nonlinear structure of the samples. This paper develops a manifold-based supervised feature extraction method that combines the manifold learning method UDP with class-label information. It seeks a projection that maximizes the nonlocal scatter while minimizing the local scatter and the within-class scatter. The method not only finds the intrinsic low-dimensional nonlinear representation of the original high-dimensional data, but is also effective for classification. Experimental results on the Yale face image database show that the proposed method outperforms both UDP and LDA.
{"title":"Manifold-Based Supervised Feature Extraction and Face Recognition","authors":"Caikou Chen, Cao Li, Jing-yu Yang","doi":"10.1109/CCPR.2008.16","DOIUrl":"https://doi.org/10.1109/CCPR.2008.16","journal":{"name":"2008 Chinese Conference on Pattern Recognition"},"publicationDate":"2008-10-31"}