Multi-view 3D face reconstruction with deep recurrent neural networks
Pub Date: 2017-10-01 | DOI: 10.1109/BTAS.2017.8272733
Pengfei Dou, I. Kakadiaris
Image-based 3D face reconstruction has great potential in areas such as facial recognition, facial analysis, and facial animation. Due to variations in image quality, single-image-based 3D face reconstruction might not be sufficient to accurately reconstruct a 3D face. To overcome this limitation, multi-view 3D face reconstruction uses multiple images of the same subject and aggregates complementary information for better accuracy. Though theoretically appealing, it poses multiple challenges in practice, the most significant being that it is difficult to establish coherent and accurate correspondence among a set of images, especially when they are captured under different conditions. In this paper, we propose a method, Deep Recurrent 3D FAce Reconstruction (DRFAR), to solve the task of multi-view 3D face reconstruction using a subspace representation of the 3D facial shape and a deep recurrent neural network that consists of both a deep convolutional neural network (DCNN) and a recurrent neural network (RNN). The DCNN disentangles the facial identity and facial expression components for each image independently, while the RNN fuses identity-related features from the DCNN and aggregates the identity-specific contextual information, or the identity signal, from the whole set of images to predict the facial identity parameter, which is robust to variations in image quality and consistent over the whole set of images. Through extensive experiments, we evaluate our proposed method and demonstrate its superiority over existing methods.
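The abstract does not include code; below is a minimal PyTorch sketch of the described architecture, assuming a 3DMM-style identity subspace with 199 coefficients and illustrative layer sizes. A shared CNN encodes each view, a GRU aggregates the identity signal across views, and a linear head predicts one identity parameter vector per image set.

```python
# Minimal sketch (not the authors' code) of the DRFAR idea.
import torch
import torch.nn as nn

class DRFARSketch(nn.Module):
    def __init__(self, feat_dim=256, num_identity_coeffs=199):
        super().__init__()
        # Per-view encoder: a small stand-in for the paper's DCNN.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # RNN fuses identity-related features over the image set.
        self.rnn = nn.GRU(feat_dim, feat_dim, batch_first=True)
        self.identity_head = nn.Linear(feat_dim, num_identity_coeffs)

    def forward(self, views):                      # views: (B, T, 3, H, W)
        b, t = views.shape[:2]
        f = self.encoder(views.flatten(0, 1))      # (B*T, feat_dim)
        f = f.view(b, t, -1)
        _, h = self.rnn(f)                         # h: (1, B, feat_dim)
        return self.identity_head(h[-1])           # one identity vector per set

out = DRFARSketch()(torch.randn(2, 4, 3, 64, 64))  # -> (2, 199)
```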
{"title":"Multi-view 3D face reconstruction with deep recurrent neural networks","authors":"Pengfei Dou, I. Kakadiaris","doi":"10.1109/BTAS.2017.8272733","DOIUrl":"https://doi.org/10.1109/BTAS.2017.8272733","url":null,"abstract":"Image-based 3D face reconstruction has great potential in different areas, such as facial recognition, facial analysis, and facial animation. Due to the variations in image quality, single-image-based 3D face reconstruction might not be sufficient to accurately reconstruct a 3D face. To overcome this limitation, multi-view 3D face reconstruction uses multiple images of the same subject and aggregates complementary information for better accuracy. Though theoretically appealing, there are multiple challenges in practice. Among these challenges, the most significant is that it is difficult to establish coherent and accurate correspondence among a set of images, especially when these images are captured in different conditions. In this paper, we propose a method, Deep Recurrent 3D FAce Reconstruction (DRFAR), to solve the task ofmulti-view 3D face reconstruction using a subspace representation of the 3D facial shape and a deep recurrent neural network that consists of both a deep con-volutional neural network (DCNN) and a recurrent neural network (RNN). The DCNN disentangles the facial identity and the facial expression components for each single image independently, while the RNN fuses identity-related features from the DCNN and aggregates the identity specific contextual information, or the identity signal, from the whole set of images to predict the facial identity parameter, which is robust to variations in image quality and is consistent over the whole set of images. Through extensive experiments, we evaluate our proposed method and demonstrate its superiority over existing methods.","PeriodicalId":372008,"journal":{"name":"2017 IEEE International Joint Conference on Biometrics (IJCB)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126606987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the guessability of binary biometric templates: A practical guessing entropy based approach
Pub Date: 2017-10-01 | DOI: 10.1109/BTAS.2017.8272719
Guangcan Mai, M. Lim, P. Yuen
A security index for biometric systems is essential because biometrics have been widely adopted as a secure authentication component in critical systems. Most biometric systems secured by template protection schemes are based on binary templates. To adopt popular template protection schemes such as fuzzy commitment and fuzzy extractor, which can be applied to binary templates only, non-binary templates (e.g., real-valued or point-set based) need to be converted to binary. However, existing security measurements for binary-template-based biometric systems either cannot reflect the actual attack difficulty or are too computationally expensive to be practical. This paper presents an acceleration of guessing entropy, which reflects the expected number of guessing trials required to attack a binary-template-based biometric system. The acceleration benefits from computation reuse and pruning. Experimental results on two datasets show that the acceleration achieves more than 6x, 20x, and 200x speedups in different system settings without losing estimation accuracy.
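For reference, guessing entropy itself is straightforward to compute when the template distribution is known: the optimal attacker guesses candidates in decreasing order of probability, and the index of the correct guess is the number of trials. The sketch below illustrates the definition only, not the paper's accelerated estimator.

```python
import numpy as np

def guessing_entropy(probs):
    p = np.sort(np.asarray(probs, dtype=float))[::-1]   # guess most likely first
    ranks = np.arange(1, p.size + 1)                    # trial index of each guess
    return float(np.sum(ranks * p))                     # E[number of trials]

# Example: a skewed 3-bit template distribution is far easier to guess
# than the uniform case, whose guessing entropy is (2^3 + 1) / 2 = 4.5.
print(guessing_entropy([0.5, 0.2, 0.1, 0.05, 0.05, 0.04, 0.03, 0.03]))
print(guessing_entropy([1 / 8] * 8))
```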
{"title":"On the guessability of binary biometric templates: A practical guessing entropy based approach","authors":"Guangcan Mai, M. Lim, P. Yuen","doi":"10.1109/BTAS.2017.8272719","DOIUrl":"https://doi.org/10.1109/BTAS.2017.8272719","url":null,"abstract":"A security index for biometric systems is essential because biometrics have been widely adopted as a secure authentication component in critical systems. Most of bio-metric systems secured by template protection schemes are based on binary templates. To adopt popular template protection schemes such as fuzzy commitment and fuzzy extractor that can be applied on binary templates only, non-binary templates (e.g., real-valued, point-set based) need to be converted to binary. However, existing security measurements for binary template based biometric systems either cannot reflect the actual attack difficulties or are too computationally expensive to be practical. This paper presents an acceleration of the guessing entropy which reflects the expected number of guessing trials in attacking the binary template based biometric systems. The acceleration benefits from computation reuse and pruning. Experimental results on two datasets show that the acceleration has more than 6x, 20x, and 200x speed up without losing the estimation accuracy in different system settings.","PeriodicalId":372008,"journal":{"name":"2017 IEEE International Joint Conference on Biometrics (IJCB)","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126350922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Location-sensitive sparse representation of deep normal patterns for expression-robust 3D face recognition
Pub Date: 2017-10-01 | DOI: 10.1109/BTAS.2017.8272703
Huibin Li, Jian Sun, Liming Chen
This paper presents a straightforward yet efficient, expression-robust 3D face recognition approach that explores a location-sensitive sparse representation of deep normal patterns (DNP). In particular, given raw 3D facial surfaces, we first run a 3D face pre-processing pipeline, including nose tip detection, face region cropping, and pose normalization. The 3D coordinates of each normalized facial surface are then projected onto the 2D plane to generate geometry images, from which three images of the facial surface normal components are estimated. Each normal image is then fed into a pre-trained deep face network to generate deep representations of facial surface normals, i.e., deep normal patterns. Considering the importance of different facial locations, we propose a location-sensitive sparse representation classifier (LS-SRC) for measuring similarity among the deep normal patterns associated with different 3D faces. Finally, simple score-level fusion of the different normal components is used for the final decision. The proposed approach achieves high performance, reporting rank-one scores of 98.01%, 97.60%, and 96.13% on the FRGC v2.0, Bosphorus, and BU-3DFE databases when only one sample per subject is used in the gallery. These experimental results reveal that the performance of 3D face recognition can be steadily improved by training deep models on massive collections of 2D face images, which opens the door to future directions in 3D face recognition.
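As a rough illustration of the geometry-image step, the three normal-component images can be estimated from an H x W x 3 grid of projected 3D coordinates by taking the cross product of the two tangent vectors; the sketch below is an assumption about this estimation, not the authors' code.

```python
import numpy as np

def normal_maps_from_geometry_image(geom):          # geom: (H, W, 3)
    du = np.gradient(geom, axis=1)                  # tangent along image columns
    dv = np.gradient(geom, axis=0)                  # tangent along image rows
    n = np.cross(du, dv)                            # un-normalised surface normal
    n /= np.linalg.norm(n, axis=2, keepdims=True) + 1e-12
    return n[..., 0], n[..., 1], n[..., 2]          # nx, ny, nz component images

nx, ny, nz = normal_maps_from_geometry_image(np.random.rand(128, 128, 3))
```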
{"title":"Location-sensitive sparse representation of deep normal patterns for expression-robust 3D face recognition","authors":"Huibin Li, Jian Sun, Liming Chen","doi":"10.1109/BTAS.2017.8272703","DOIUrl":"https://doi.org/10.1109/BTAS.2017.8272703","url":null,"abstract":"This paper presents a straight-forward yet efficient, and expression-robust 3D face recognition approach by exploring location sensitive sparse representation of deep normal patterns (DNP). In particular, given raw 3D facial surfaces, we first run 3D face pre-processing pipeline, including nose tip detection, face region cropping, and pose normalization. The 3D coordinates of each normalized 3D facial surface are then projected into 2D plane to generate geometry images, from which three images of facial surface normal components are estimated. Each normal image is then fed into a pre-trained deep face net to generate deep representations of facial surface normals, i.e., deep normal patterns. Considering the importance of different facial locations, we propose a location sensitive sparse representation classifier (LS-SRC) for similarity measure among deep normal patterns associated with different 3D faces. Finally, simple score-level fusion of different normal components are used for the final decision. The proposed approach achieves significantly high performance, and reporting rank-one scores of 98.01%, 97.60%, and 96.13% on the FRGC v2.0, Bosphorus, and BU-3DFE databases when only one sample per subject is used in the gallery. These experimental results reveals that the performance of 3D face recognition would be constantly improved with the aid of training deep models from massive 2D face images, which opens the door for future directions of 3D face recognition.","PeriodicalId":372008,"journal":{"name":"2017 IEEE International Joint Conference on Biometrics (IJCB)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128974683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ICFVR 2017: 3rd international competition on finger vein recognition
Pub Date: 2017-10-01 | DOI: 10.1109/BTAS.2017.8272760
Yi Zhang, Houjun Huang, Haifeng Zhang, Liao Ni, W. Xu, N. U. Ahmed, Md. Shakil Ahmed, Yilun Jin, Ying Chen, Jingxuan Wen, Wenxin Li
In recent years, finger vein recognition has become an important sub-field of biometrics and has been applied in real-world settings. The development of finger vein recognition algorithms depends heavily on large-scale real-world data sets. To motivate research on finger vein recognition, we released the largest finger vein data set to date and hold finger vein recognition competitions based on it every year. In 2017, the International Competition on Finger Vein Recognition (ICFVR) was held jointly with IJCB 2017. Eleven teams registered and ten of them joined the final evaluation. This year's winner dramatically improved the EER from 2.64% to 0.483% compared to last year's winner. In this paper, we introduce the process and results of ICFVR 2017 and give insights into the development of state-of-the-art finger vein recognition algorithms.
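For context, the EER reported above is the operating point where the false accept rate equals the false reject rate; a minimal (unofficial) computation from genuine and impostor score sets looks like this:

```python
import numpy as np

def eer(genuine, impostor):
    # Sweep all observed scores as thresholds; accept when score >= threshold.
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor >= t).mean() for t in thresholds])  # false accepts
    frr = np.array([(genuine < t).mean() for t in thresholds])    # false rejects
    i = np.argmin(np.abs(far - frr))          # closest crossing of the two curves
    return (far[i] + frr[i]) / 2.0

rng = np.random.default_rng(0)
print(eer(rng.normal(0.8, 0.1, 1000), rng.normal(0.3, 0.1, 1000)))
```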
{"title":"ICFVR 2017: 3rd international competition on finger vein recognition","authors":"Yi Zhang, Houjun Huang, Haifeng Zhang, Liao Ni, W. Xu, N. U. Ahmed, Md. Shakil Ahmed, Yilun Jin, Ying Chen, Jingxuan Wen, Wenxin Li","doi":"10.1109/BTAS.2017.8272760","DOIUrl":"https://doi.org/10.1109/BTAS.2017.8272760","url":null,"abstract":"In recent years, finger vein recognition has become an important sub-field in biometrics and been applied to real-world applications. The development of finger vein recognition algorithms heavily depends on large-scale real-world data sets. In order to motivate research on finger vein recognition, we released the largest finger vein data set up to now and hold finger vein recognition competitions based on our data set every year. In 2017, International Competition on Finger Vein Recognition (ICFVR) is held jointly with IJCB 2017. 11 teams registered and 10 of them joined the final evaluation. The winner of this year dramatically improved the EER from 2.64% to 0.483% compared to the 'winner of last year. In this paper, we introduce the process and results of ICFVR 2017 and give insights on development of state-of-art finger vein recognition algorithms.","PeriodicalId":372008,"journal":{"name":"2017 IEEE International Joint Conference on Biometrics (IJCB)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124421728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In defense of low-level structural features and SVMs for facial attribute classification: Application to detection of eye state, Mouth State, and eyeglasses in the wild
Pub Date: 2017-10-01 | DOI: 10.1109/BTAS.2017.8272747
Abdulaziz Alorf, A. L. Abbott
The current trend in image analysis is to employ automatically learned feature types, such as those obtained using deep-learning techniques. For some applications, however, manually crafted features such as Histogram of Oriented Gradients (HOG) continue to yield better performance in demanding situations. This paper considers both approaches to the problem of facial attribute classification for images obtained “in the wild.” Attributes of particular interest are eye state (open/closed), mouth state (open/closed), and eyeglasses (present/absent). We present a full face-processing pipeline, from detection to attribute classification, that employs conventional machine learning techniques. Experimental results indicate better performance using RootSIFT with a conventional support-vector machine (SVM) approach than deep-learning approaches reported in the literature. Our proposed open/closed eye classifier yields an accuracy of 99.3% on the CEW dataset and 98.7% on the ZJU dataset. Similarly, our proposed open/closed mouth classifier achieves performance similar to deep learning. Our proposed presence/absence eyeglasses classifier also delivers very good performance, being the best method on LFWA and second best on the CelebA dataset. The system reported here runs at 30 fps on HD-sized video using a CPU-only implementation.
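The RootSIFT mapping the pipeline relies on (Arandjelovic and Zisserman, 2012) is simple: L1-normalise each SIFT descriptor and take element-wise square roots, so that Euclidean distance on the result corresponds to the Hellinger kernel on the original descriptors:

```python
import numpy as np

def root_sift(descriptors, eps=1e-7):
    d = np.asarray(descriptors, dtype=float)
    d /= d.sum(axis=1, keepdims=True) + eps   # L1 normalisation per descriptor
    return np.sqrt(d)                         # Hellinger (square-root) mapping

feats = root_sift(np.abs(np.random.rand(10, 128)))  # 10 SIFT-like descriptors
```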
{"title":"In defense of low-level structural features and SVMs for facial attribute classification: Application to detection of eye state, Mouth State, and eyeglasses in the wild","authors":"Abdulaziz Alorf, A. L. Abbott","doi":"10.1109/BTAS.2017.8272747","DOIUrl":"https://doi.org/10.1109/BTAS.2017.8272747","url":null,"abstract":"The current trend in image analysis is to employ automatically detected feature types, such as those obtained using deep-learning techniques. For some applications, however, manually crafted features such as Histogram of Oriented Gradients (HOG) continue to yield better performance in demanding situations. This paper considers both approaches for the problem of facial attribute classification, for images obtained “in the wild.” Attributes of particular interest are eye state (open/closed), mouth state (open/closed), and eyeglasses (present/absent). We present a full face-processing pipeline that employs conventional machine learning techniques, from detection to attribute classification. Experimental results have indicated better performance using RootSIFT with a conventional support-vector machine (SVM) approach, as compared to deep-learning approaches that have been reported in the literature. Our proposed open/closed eye classifier has yielded an accuracy of 99.3% on the CEW dataset, and an accuracy of 98.7% on the ZJU dataset. Similarly, our proposed open/closed mouth classifier has achieved performance similar to deep learning. Also, our proposed presence/absence eyeglasses classifier delivered very good performance, being the best method on LFWA, and second best for the CelebA dataset. The system reported here runs at 30 fps on HD-sized video using a CPU-only implementation.","PeriodicalId":372008,"journal":{"name":"2017 IEEE International Joint Conference on Biometrics (IJCB)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132609026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards pre-alignment of near-infrared iris images
Pub Date: 2017-10-01 | DOI: 10.1109/BTAS.2017.8272718
P. Drozdowski, C. Rathgeb, H. Hofbauer, J. Wagner, A. Uhl, C. Busch
The necessity of biometric template alignment imposes a significant computational load and increases the probability of false positive occurrences in biometric systems. While automatic pre-alignment of biometric samples is utilised for some modalities, this topic has not yet been explored for systems based on the iris. This paper presents a method for pre-alignment of iris images based on the positions of automatically detected eye corners. Existing work in the area of automatic eye corner detection has hitherto involved only visible-wavelength images; for the near-infrared images used in the vast majority of current iris recognition systems, this task is significantly more challenging and as yet unexplored. A comparative study of two methods for solving this problem is presented in this paper. The eye corners detected by the two methods are then used in pre-alignment and biometric performance evaluation experiments. The system utilising image pre-alignment is benchmarked against a baseline iris recognition system on the iris subset of the BioSecure database. In the benchmark, the workload associated with alignment compensation is significantly reduced, while the biometric performance remains unchanged or even improves slightly.
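A plausible sketch of corner-based pre-alignment, assuming OpenCV and not necessarily the paper's exact procedure: rotate the image so that the line through the two detected eye corners becomes horizontal, removing in-plane rotation before iris encoding.

```python
import cv2
import numpy as np

def prealign(image, inner_corner, outer_corner):
    (x1, y1), (x2, y2) = inner_corner, outer_corner
    angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))   # in-plane tilt of the eye
    center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)        # rotate about the midpoint
    rot = cv2.getRotationMatrix2D(center, angle, 1.0)  # rotate the tilt away
    h, w = image.shape[:2]
    return cv2.warpAffine(image, rot, (w, h))

aligned = prealign(np.zeros((480, 640), np.uint8), (200, 250), (300, 260))
```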
{"title":"Towards pre-alignment of near-infrared iris images","authors":"P. Drozdowski, C. Rathgeb, H. Hofbauer, J. Wagner, A. Uhl, C. Busch","doi":"10.1109/BTAS.2017.8272718","DOIUrl":"https://doi.org/10.1109/BTAS.2017.8272718","url":null,"abstract":"The necessity of biometric template alignment imposes a significant computational load and increases the probability of false positive occurrences in biometric systems. While for some modalities, automatic pre-alignment of biometric samples is utilised, this topic has not yet been explored for systems based on the iris. This paper presents a method for pre-alignment of iris images based on the positions ofautomatically detected eye corners. Existing work in the area of automatic eye corner detection has hitherto only involved visible wavelength images; for the near-infrared images, used in the vast majority of current iris recognition systems, this task is significantly more challenging and as of yet unexplored. A comparative study of two methods for solving this problem is presented in this paper. The eye corners detected by the two methods are then used for the pre-alignment and biometric performance evaluation experiments. The system utilising image pre-alignment is benchmarked against a baseline iris recognition system on the iris subset of the BioSecure database. In the benchmark, the workload associated with alignment compensation is significantly reduced, while the biometric performance remains unchanged or even improves slightly.","PeriodicalId":372008,"journal":{"name":"2017 IEEE International Joint Conference on Biometrics (IJCB)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114911039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Computing an image Phylogeny Tree from photometrically modified iris images
Pub Date: 2017-10-01 | DOI: 10.1109/BTAS.2017.8272749
Sudipta Banerjee, A. Ross
Iris recognition entails the use of iris images to recognize an individual. In some cases, the iris image acquired from an individual can be modified by subjecting it to successive photometric transformations, such as brightening, gamma correction, median filtering, and Gaussian smoothing, resulting in a family of transformed images. Automatically inferring the relationships within the set of transformed images is important in the context of digital image forensics. In this regard, we develop a method to generate an Image Phylogeny Tree (IPT) from a set of such transformed images. Our strategy entails modeling an arbitrary photometric transformation as a linear or non-linear function and utilizing the parameters of the model to quantify the relationship between pairs of images. The estimated parameters are then used to generate the IPT. Modest yet promising results are obtained in terms of parameter estimation and IPT generation.
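As an illustration of the parameter-estimation idea under the linear model (the paper also considers non-linear models), one can fit I_j ≈ a·I_i + b by least squares and use the fit residual as a dissimilarity cue between an image pair; the tree-construction step is omitted here.

```python
import numpy as np

def fit_linear_photometric(src, dst):
    x = src.astype(float).ravel()
    y = dst.astype(float).ravel()
    A = np.stack([x, np.ones_like(x)], axis=1)       # model: y = a*x + b
    coef, _, _, _ = np.linalg.lstsq(A, y, rcond=None)
    a, b = coef
    residual = np.mean((A @ coef - y) ** 2)          # small: dst plausibly derived from src
    return a, b, residual

img = np.random.rand(64, 64)
print(fit_linear_photometric(img, 1.2 * img + 0.05))  # recovers a ~ 1.2, b ~ 0.05
```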
{"title":"Computing an image Phylogeny Tree from photometrically modified iris images","authors":"Sudipta Banerjee, A. Ross","doi":"10.1109/BTAS.2017.8272749","DOIUrl":"https://doi.org/10.1109/BTAS.2017.8272749","url":null,"abstract":"Iris recognition entails the use of iris images to recognize an individual. In some cases, the iris image acquired from an individual can be modified by subjecting it to successive photometric transformations such as brightening, gamma correction, median filtering and Gaussian smoothing, resulting in a family of transformed images. Automatically inferring the relationship between the set of transformed images is important in the context of digital image forensics. In this regard, we develop a method to generate an Image Phylogeny Tree (IPT) from a set of such transformed images. Our strategy entails modeling an arbitrary photometric transformation as a linear or non-linear function and utilizing the parameters of the model to quantify the relationship between pairs of images. The estimated parameters are then used to generate the IPT. Modest, yet promising, results are obtained in terms of parameter estimation and IPT generation.","PeriodicalId":372008,"journal":{"name":"2017 IEEE International Joint Conference on Biometrics (IJCB)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130798426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extracting sub-glottal and Supra-glottal features from MFCC using convolutional neural networks for speaker identification in degraded audio signals
Pub Date: 2017-10-01 | DOI: 10.1109/BTAS.2017.8272748
Anurag Chowdhury, A. Ross
We present a deep-learning-based algorithm for speaker recognition from degraded audio signals. We use the commonly employed Mel-Frequency Cepstral Coefficients (MFCC) to represent the audio signals. A convolutional neural network (CNN) based on 1D filters, rather than 2D filters, is then designed. The filters in the CNN are designed to learn the inter-dependency between cepstral coefficients extracted from audio frames of fixed temporal extent. Our approach aims to extract speaker-dependent features of the human speech production apparatus, such as sub-glottal and supra-glottal features, for identifying speakers from degraded audio signals. The performance of the proposed method is compared against existing baseline schemes on both synthetically and naturally corrupted speech data. Experiments demonstrate the efficacy of the proposed architecture for speaker recognition.
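A minimal PyTorch sketch of such a 1D CNN, with layer sizes and MFCC dimensionality as illustrative assumptions: the 1D filters convolve across time, treating the cepstral coefficients as input channels.

```python
import torch
import torch.nn as nn

class Speaker1DCNN(nn.Module):
    def __init__(self, n_mfcc=20, n_speakers=100):
        super().__init__()
        self.net = nn.Sequential(
            # 1D filters span all MFCC channels, learning their inter-dependency.
            nn.Conv1d(n_mfcc, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),   # pool over time frames
            nn.Linear(128, n_speakers),              # speaker logits
        )

    def forward(self, mfcc):          # mfcc: (batch, n_mfcc, time_frames)
        return self.net(mfcc)

logits = Speaker1DCNN()(torch.randn(4, 20, 300))   # -> (4, 100)
```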
{"title":"Extracting sub-glottal and Supra-glottal features from MFCC using convolutional neural networks for speaker identification in degraded audio signals","authors":"Anurag Chowdhury, A. Ross","doi":"10.1109/BTAS.2017.8272748","DOIUrl":"https://doi.org/10.1109/BTAS.2017.8272748","url":null,"abstract":"We present a deep learning based algorithm for speaker recognition from degraded audio signals. We use the commonly employed Mel-Frequency Cepstral Coefficients (MFCC) for representing the audio signals. A convolutional neural network (CNN) based on 1D filters, rather than 2D filters, is then designed. The filters in the CNN are designed to learn inter-dependency between cepstral coefficients extracted from audio frames of fixed temporal expanse. Our approach aims at extracting speaker dependent features, like Sub-glottal and Supra-glottal features, of the human speech production apparatus for identifying speakers from degraded audio signals. The performance of the proposed method is compared against existing baseline schemes on both synthetically and naturally corrupted speech data. Experiments convey the efficacy of the proposed architecture for speaker recognition.","PeriodicalId":372008,"journal":{"name":"2017 IEEE International Joint Conference on Biometrics (IJCB)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132225129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Face anti-spoofing using patch and depth-based CNNs
Pub Date: 2017-10-01 | DOI: 10.1109/BTAS.2017.8272713
Yousef Atoum, Yaojie Liu, Amin Jourabloo, Xiaoming Liu
The face image is the most accessible biometric modality and is used in highly accurate face recognition systems, yet it is vulnerable to many different types of presentation attacks. Face anti-spoofing is therefore a critical step before feeding a face image to a biometric system. In this paper, we propose a novel two-stream CNN-based approach for face anti-spoofing that extracts local features and holistic depth maps from face images. The local features help the CNN discriminate spoof patches independently of the spatial face areas, while the holistic depth map examines whether the input image has face-like depth. Extensive experiments are conducted on challenging databases (CASIA-FASD, MSU-USSA, and Replay-Attack), with comparison to the state of the art.
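A sketch of the two-stream decision logic; the fusion rule and flatness cue here are assumptions, not the paper's exact formulation. The patch stream scores local spoof evidence, while the depth stream checks whether the estimated depth map is face-like rather than flat, as it would be for a printed photo or screen.

```python
import torch

def fuse_spoof_scores(patch_scores, depth_map, w=0.5):
    # patch_scores: (num_patches,) spoof probabilities from the patch CNN.
    # depth_map: (H, W) depth estimated by the depth CNN; a near-flat map
    # (low variance) suggests a planar presentation attack.
    patch_score = patch_scores.mean()
    flatness = 1.0 / (1.0 + depth_map.var())     # high for flat, spoof-like maps
    return w * patch_score + (1 - w) * flatness  # higher = more likely spoof

score = fuse_spoof_scores(torch.rand(16), torch.rand(32, 32))
```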
{"title":"Face anti-spoofing using patch and depth-based CNNs","authors":"Yousef Atoum, Yaojie Liu, Amin Jourabloo, Xiaoming Liu","doi":"10.1109/BTAS.2017.8272713","DOIUrl":"https://doi.org/10.1109/BTAS.2017.8272713","url":null,"abstract":"The face image is the most accessible biometric modality which is used for highly accurate face recognition systems, while it is vulnerable to many different types of presentation attacks. Face anti-spoofing is a very critical step before feeding the face image to biometric systems. In this paper, we propose a novel two-stream CNN-based approach for face anti-spoofing, by extracting the local features and holistic depth maps from the face images. The local features facilitate CNN to discriminate the spoof patches independent of the spatial face areas. On the other hand, holistic depth map examine whether the input image has a face-like depth. Extensive experiments are conducted on the challenging databases (CASIA-FASD, MSU-USSA, and Replay Attack), with comparison to the state of the art.","PeriodicalId":372008,"journal":{"name":"2017 IEEE International Joint Conference on Biometrics (IJCB)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121188977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning optimised representations for view-invariant gait recognition
Pub Date: 2017-10-01 | DOI: 10.1109/BTAS.2017.8272769
Ning Jia, Victor Sanchez, Chang-Tsun Li
Gait recognition can be performed without subject cooperation under harsh conditions, making it an important tool in forensic gait analysis, security control, and other commercial applications. One critical issue that prevents gait recognition systems from being widely accepted is the performance drop when the camera viewpoint varies between the registered templates and the query data. In this paper, we explore the potential of combining feature optimisers with representations learned by convolutional neural networks (CNNs) to achieve efficient view-invariant gait recognition. The experimental results indicate that CNNs learn highly discriminative representations across moderate view variations and that these representations can be further improved using view-invariant feature selectors, achieving high matching accuracy across views.
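An illustrative sketch in which the feature selector is a simple stand-in, not the paper's optimiser: keep the embedding dimensions that are most stable for the same subject across views, then match probe to gallery by cosine similarity.

```python
import numpy as np

def select_view_stable_dims(emb_view_a, emb_view_b, keep=128):
    # emb_view_*: (num_subjects, dim) embeddings of the same subjects, two views.
    instability = np.mean((emb_view_a - emb_view_b) ** 2, axis=0)
    return np.argsort(instability)[:keep]          # most view-invariant dimensions

def cosine_match(gallery, probe, dims):
    g = gallery[:, dims] / np.linalg.norm(gallery[:, dims], axis=1, keepdims=True)
    p = probe[:, dims] / np.linalg.norm(probe[:, dims], axis=1, keepdims=True)
    return np.argmax(p @ g.T, axis=1)              # best gallery id per probe

a, b = np.random.rand(50, 512), np.random.rand(50, 512)
ids = cosine_match(a, b, select_view_stable_dims(a, b))
```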
{"title":"Learning optimised representations for view-invariant gait recognition","authors":"Ning Jia, Victor Sanchez, Chang-Tsun Li","doi":"10.1109/BTAS.2017.8272769","DOIUrl":"https://doi.org/10.1109/BTAS.2017.8272769","url":null,"abstract":"Gait recognition can be performed without subject cooperation under harsh conditions, thus it is an important tool in forensic gait analysis, security control, and other commercial applications. One critical issue that prevents gait recognition systems from being widely accepted is the performance drop when the camera viewpoint varies between the registered templates and the query data. In this paper, we explore the potential of combining feature optimisers and representations learned by convolutional neural networks (CNN) to achieve efficient view-invariant gait recognition. The experimental results indicate that CNN learns highly discriminative representations across moderate view variations, and these representations can be further improved using view-invariant feature selectors, achieving a high matching accuracy across views.","PeriodicalId":372008,"journal":{"name":"2017 IEEE International Joint Conference on Biometrics (IJCB)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116031842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}