Domain Adaptation in Multi-Channel Autoencoder based Features for Robust Face Anti-Spoofing
Pub Date: 2019-06-01 | DOI: 10.1109/ICB45273.2019.8987247
O. Nikisins, Anjith George, S. Marcel
While the performance of face recognition systems has improved significantly in the last decade, they have proved to be highly vulnerable to presentation attacks (spoofing). Most research in the field of face presentation attack detection (PAD) has focused on boosting the performance of systems within a single database. Face PAD datasets are usually captured with RGB cameras and contain a very limited number of both bona-fide samples and presentation attack instruments. Training face PAD systems on such data leads to poor performance, even in the closed-set scenario, especially when sophisticated attacks are involved. We explore two paths to boost the performance of a face PAD system against challenging attacks. First, we use multi-channel (RGB, Depth and NIR) data, which is already easily accessible in a number of mass-production devices. Second, we develop a novel Autoencoder + MLP based face PAD algorithm. Moreover, instead of collecting more data to train the proposed deep architecture, a domain adaptation technique is proposed, transferring knowledge of facial appearance from the RGB to the multi-channel domain. We also demonstrate that features learned from individual facial regions are more discriminative than features learned from the entire face. The proposed system is tested on a very recent publicly available multi-channel PAD database with a wide variety of presentation attacks.
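As a rough, hedged illustration of the Autoencoder + MLP pipeline summarized above (this is not the authors' architecture: the 5-channel input, 32x32 patch size, layer widths and two-stage training schedule are all assumptions), the following PyTorch sketch pre-trains an autoencoder on facial-region patches and then trains a small MLP on the frozen bottleneck codes to produce a bona-fide/attack score.

```python
import torch
import torch.nn as nn

class PatchAutoencoder(nn.Module):
    """Autoencoder over flattened multi-channel facial-region patches (sizes illustrative)."""
    def __init__(self, in_dim=5 * 32 * 32, code_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, code_dim), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 512), nn.ReLU(),
            nn.Linear(512, in_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code

class PadMLP(nn.Module):
    """MLP head mapping the bottleneck code to a bona-fide/attack logit."""
    def __init__(self, code_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(code_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, code):
        return self.net(code)

# Stage 1: pre-train the autoencoder on bona-fide patches (the RGB-to-multi-channel
# domain adaptation step described in the abstract is only hinted at here).
ae, mlp = PatchAutoencoder(), PadMLP()
opt_ae = torch.optim.Adam(ae.parameters(), lr=1e-3)
x = torch.rand(16, 5 * 32 * 32)            # dummy batch of flattened patches
recon, _ = ae(x)
loss_ae = nn.functional.mse_loss(recon, x)
opt_ae.zero_grad(); loss_ae.backward(); opt_ae.step()

# Stage 2: train the MLP on frozen bottleneck codes.
opt_mlp = torch.optim.Adam(mlp.parameters(), lr=1e-3)
y = torch.randint(0, 2, (16, 1)).float()   # dummy labels: 1 = bona fide, 0 = attack
with torch.no_grad():
    _, code = ae(x)
logits = mlp(code)
loss_mlp = nn.functional.binary_cross_entropy_with_logits(logits, y)
opt_mlp.zero_grad(); loss_mlp.backward(); opt_mlp.step()
```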
PRNU-based finger vein sensor identification: On the effect of different sensor croppings
Pub Date: 2019-06-01 | DOI: 10.1109/ICB45273.2019.8987237
Dominik Söllinger, Babak Maser, A. Uhl
In this work, we study the applicability of PRNU-based sensor identification methods to finger vein imagery. We also investigate the effect of different image regions on identification performance by looking at five croppings of different sizes. The proposed method is tested on eight publicly available finger vein datasets. For each finger vein sensor, a noise reference pattern is generated and subsequently matched with noise residuals extracted from previously unseen finger vein images. Although the final result strongly encourages the use of PRNU-based approaches for sensor identification, it can also be observed that the choice of image region for PRNU extraction is crucial. The results clearly show that regions containing the biometric trait (varying content) should be preferred over background regions containing no biometric trait (identical content).
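A bare-bones sketch of the generic PRNU workflow the abstract refers to: estimate a per-sensor reference pattern by averaging noise residuals of enrollment images, then score a probe by correlating its residual with that pattern. A Gaussian filter stands in for the wavelet-based denoiser typically used in PRNU work, and the crop size is an arbitrary placeholder.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(img, sigma=1.0):
    """Residual = image - denoised image (Gaussian denoiser as a simple stand-in)."""
    img = img.astype(np.float64)
    return img - gaussian_filter(img, sigma)

def reference_pattern(images):
    """Average residuals of many images from one sensor to estimate its PRNU pattern."""
    return np.mean([noise_residual(im) for im in images], axis=0)

def ncc(a, b):
    """Normalized cross-correlation between a probe residual and a reference pattern."""
    a = a - a.mean(); b = b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy usage on a fixed crop (the cropping choice is exactly what the paper studies).
rng = np.random.default_rng(0)
enroll = [rng.integers(0, 256, (128, 128)) for _ in range(20)]   # same "sensor"
probe  = rng.integers(0, 256, (128, 128))
K = reference_pattern(enroll)
score = ncc(noise_residual(probe), K)   # higher score -> same sensor more likely
print(f"correlation score: {score:.4f}")
```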
Audio-Visual Kinship Verification in the Wild
Pub Date: 2019-06-01 | DOI: 10.1109/ICB45273.2019.8987241
Xiaoting Wu, Eric Granger, T. Kinnunen, Xiaoyi Feng, A. Hadid
Kinship verification is a challenging problem in which recognition systems are trained to establish a kin relation between two individuals based on facial images or videos. However, due to variations in capture conditions (background, pose, expression, illumination and occlusion), state-of-the-art systems currently provide a low level of accuracy. As in many visual recognition and affective computing applications, kinship verification may benefit from a combination of discriminant information extracted from both video and audio signals. In this paper, we investigate for the first time the fusion of audio-visual information from both face and voice modalities to improve kinship verification accuracy. First, we propose a new multi-modal kinship dataset called TALking KINship (TALKIN), comprising several pairs of video sequences with subjects talking. State-of-the-art conventional and deep learning models are assessed and compared for kinship verification using this dataset. Finally, we propose a deep Siamese network for multi-modal fusion of kinship relations. Experiments with the TALKIN dataset indicate that the proposed Siamese network provides a significantly higher level of accuracy than baseline uni-modal and multi-modal fusion techniques for kinship verification. Results also indicate that audio (vocal) information is complementary and useful for the kinship verification problem.
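A hedged sketch of one possible audio-visual Siamese design for pair-wise kinship verification; the feature dimensions, the late-fusion layout and the contrastive loss are assumptions rather than the network proposed in the paper.

```python
import torch
import torch.nn as nn

class AudioVisualEmbedder(nn.Module):
    """Shared branch: fuses one person's face and voice features into a single embedding."""
    def __init__(self, face_dim=512, voice_dim=192, emb_dim=128):
        super().__init__()
        self.face = nn.Sequential(nn.Linear(face_dim, 256), nn.ReLU())
        self.voice = nn.Sequential(nn.Linear(voice_dim, 256), nn.ReLU())
        self.fuse = nn.Linear(512, emb_dim)

    def forward(self, face_feat, voice_feat):
        z = torch.cat([self.face(face_feat), self.voice(voice_feat)], dim=1)
        return nn.functional.normalize(self.fuse(z), dim=1)

def contrastive_loss(e1, e2, label, margin=1.0):
    """label = 1 for kin pairs, 0 for non-kin pairs; pull kin together, push non-kin apart."""
    d = (e1 - e2).norm(dim=1)
    return (label * d.pow(2) + (1 - label) * torch.clamp(margin - d, min=0).pow(2)).mean()

# Dummy pair batch: persons A and B, each described by pre-extracted face + voice features.
net = AudioVisualEmbedder()
fa, va = torch.rand(8, 512), torch.rand(8, 192)
fb, vb = torch.rand(8, 512), torch.rand(8, 192)
y = torch.randint(0, 2, (8,)).float()
loss = contrastive_loss(net(fa, va), net(fb, vb), y)
loss.backward()
```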
PPG2Live: Using dual PPG for active authentication and liveness detection
Pub Date: 2019-06-01 | DOI: 10.1109/ICB45273.2019.8987330
Jan Spooren, D. Preuveneers, W. Joosen
This paper presents a novel solution based on PPG to strengthen face authentication. Our method leverages and combines different PPG signals from multiple channels to meet two objectives. First, it complements face authentication with an additional authentication factor, and second, it strengthens liveness detection to be more resistant against presentation attacks. Our solution can be implemented as an unlock screen for mobile phones with a front-side and back-side camera or a paired smart watch, as well as for webcam-enabled laptops augmented with a PPG sensor. Our evaluation shows that our method can significantly improve the resilience against presentation attacks for face recognition-based user authentication.
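To make the PPG idea concrete, here is a minimal, hedged sketch of extracting a PPG trace from video frames, band-pass filtering it, and comparing the heart rates estimated from two channels (a large mismatch would be suspicious). The filter band, sampling rate and decision logic are assumptions, not the paper's actual multi-channel scheme.

```python
import numpy as np
from scipy.signal import butter, filtfilt, periodogram

def ppg_from_frames(frames):
    """Mean green-channel intensity of a face/finger ROI per frame -> raw PPG trace."""
    return np.array([f[..., 1].mean() for f in frames])

def bandpass(sig, fs, lo=0.7, hi=4.0, order=3):
    """Keep plausible heart-rate frequencies (~42-240 bpm)."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, sig)

def heart_rate_bpm(sig, fs):
    f, p = periodogram(sig, fs)
    return 60.0 * f[np.argmax(p)]

# Toy check: two simultaneously captured channels should agree on heart rate for a
# live subject; a large mismatch would suggest a presentation attack.
fs, t = 30.0, np.arange(0, 10, 1 / 30.0)
rng = np.random.default_rng(1)
chan_a = np.sin(2 * np.pi * 1.2 * t) + 0.3 * rng.standard_normal(t.size)  # ~72 bpm
chan_b = np.sin(2 * np.pi * 1.2 * t) + 0.3 * rng.standard_normal(t.size)
hr_a = heart_rate_bpm(bandpass(chan_a, fs), fs)
hr_b = heart_rate_bpm(bandpass(chan_b, fs), fs)
print(f"HR A: {hr_a:.1f} bpm, HR B: {hr_b:.1f} bpm, |diff|: {abs(hr_a - hr_b):.1f}")
```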
ScleraSegNet: an Improved U-Net Model with Attention for Accurate Sclera Segmentation
Pub Date: 2019-06-01 | DOI: 10.1109/ICB45273.2019.8987270
Caiyong Wang, Yong He, Yunfan Liu, Zhaofeng He, R. He, Zhenan Sun
Accurate sclera segmentation is critical for successful sclera recognition. However, studies on sclera segmentation algorithms are still limited in the literature. In this paper, we propose a novel sclera segmentation method based on an improved U-Net model, named ScleraSegNet. We perform an in-depth analysis of the structure of the U-Net model and propose to embed an attention module into the central bottleneck between the contracting path and the expansive path of U-Net to strengthen its ability to learn discriminative representations. We compare different attention modules and find that channel-wise attention is the most effective in improving the performance of the segmentation network. In addition, we evaluate the effectiveness of data augmentation in improving the generalization ability of the segmentation network. Experimental results show that the best-performing configuration of the proposed method achieves state-of-the-art performance, with F-measure values of 91.43% and 89.54% on UBIRIS.v2 and MICHE, respectively.
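The channel-wise attention the authors report as most effective is commonly realized as a squeeze-and-excitation style block; the sketch below shows such a block applied to a bottleneck feature map. The channel count, reduction ratio and placement details are assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention, the kind of module that can sit
    in the U-Net bottleneck between the contracting and expansive paths."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                  # squeeze: global spatial context
        self.fc = nn.Sequential(                             # excitation: per-channel weights
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                         # reweight feature channels

# Applied to a dummy bottleneck feature map.
feat = torch.rand(2, 256, 16, 16)
print(ChannelAttention(256)(feat).shape)   # torch.Size([2, 256, 16, 16])
```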
Authenticating Phone Users Using a Gait-Based Histogram Approach on Mobile App Sessions
Pub Date: 2019-06-01 | DOI: 10.1109/ICB45273.2019.8987418
T. Neal, M. A. Noor, P. Gera, Khadija Zanna, G. Kaptan
Collectively, user-friendly interfaces, small but impactful sensing technologies, intuitive device designs, and the variety of mobile applications (or apps) have transformed the expectations for cellular phones. Apps are a primary factor in device functionality; they allow users to quickly carry out tasks directly on their device. This paper leverages mobile apps for continuous authentication of mobile device users. We borrow from a gait-based approach by continuously extracting n-bin histograms from numerically encoded app data. Since more active subjects generate more data, it would be trivial to distinguish between these subjects and others who are not as active. Thus, we divided a dataset of 19 months of app data from 181 subjects into three datasets to determine whether minimally active, moderately active, or very active subjects were more challenging to authenticate. Using the absolute distance between two histograms, our approach yielded a worst-case EER of 0.188 and a best-case EER of 0.036, with a worst-case initial training period of 1.06 hours. We also show a positive correlation between user activity level and performance, and between template size and performance. Our method is characterized by minimal training samples and a context-independent evaluation, addressing important factors that are known to affect the practicality of continuous authentication systems.
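A minimal sketch of the histogram-based comparison described above: build n-bin histograms from numerically encoded app events and score a probe against a user template with the absolute (L1) distance. The bin count, value range and toy data are assumptions.

```python
import numpy as np

def app_histogram(encoded_events, n_bins=20, value_range=(0, 100)):
    """n-bin histogram over numerically encoded app events, L1-normalised."""
    h, _ = np.histogram(encoded_events, bins=n_bins, range=value_range)
    return h / max(h.sum(), 1)

def abs_distance(h1, h2):
    """Absolute (L1) distance between two histograms; smaller = more similar."""
    return float(np.abs(h1 - h2).sum())

# Toy enrolment/verification: compare probe histograms against a user template.
rng = np.random.default_rng(2)
template = app_histogram(rng.normal(40, 10, 500))   # genuine user's training data
genuine  = app_histogram(rng.normal(40, 10, 200))   # same usage pattern
impostor = app_histogram(rng.normal(70, 10, 200))   # different usage pattern
print(abs_distance(template, genuine), abs_distance(template, impostor))
```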
Fingerprint Quality: Mapping NFIQ1 Classes and NFIQ2 Values
Pub Date: 2019-06-01 | DOI: 10.1109/ICB45273.2019.8987244
Javier Galbally, Rudolf Haraksim, P. Ferrara, Laurent Beslay, Elham Tabassi
Over the last two decades of biometric research, it has been shown on numerous occasions that the quality of biometric samples has a key impact on the performance of biometric recognition systems. Few other biometric characteristics, if any, have been analysed as in depth from a quality perspective as fingerprints. This is largely due to the development by the US NIST of two successive system-independent metrics that have become a standard for estimating fingerprint quality: NFIQ1 and NFIQ2. However, in spite of their unquestionable influence on the development of fingerprint technology, there is still a lack of understanding of how these two metrics relate to each other. The present article is an attempt to bridge this gap, presenting new insight into the meaningfulness of both metrics and describing a mapping function between NFIQ2 values and NFIQ1 classes.
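For illustration only, a mapping from continuous NFIQ2 scores (0-100, higher is better) to the five NFIQ1 classes (1 best to 5 worst) can be expressed as a simple binning function; the thresholds below are hypothetical placeholders and not the mapping derived in the paper.

```python
import numpy as np

def map_nfiq2_to_nfiq1(nfiq2_scores, thresholds=(20, 40, 60, 80)):
    """Map NFIQ2 scores (0-100, higher = better quality) to 5 discrete classes.
    NOTE: these thresholds are purely illustrative placeholders, not the mapping
    derived in the paper; NFIQ1 classes run from 1 (best) to 5 (worst)."""
    # np.digitize returns 0..4 for the five intervals; invert so higher NFIQ2 -> class 1.
    bins = np.digitize(nfiq2_scores, thresholds)
    return 5 - bins

print(map_nfiq2_to_nfiq1(np.array([5, 25, 55, 85])))   # e.g. [5 4 3 1]
```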
Thermal and Cross-spectral Palm Image Matching in the Visual Domain by Robust Image Transformation
Pub Date: 2019-06-01 | DOI: 10.1109/ICB45273.2019.8987435
Ewelina Bartuzi, N. Damer
Synthesizing visual-like images from those captured in the thermal spectrum allows for direct cross-domain comparisons. Moreover, it enables thermal-to-thermal comparisons that take advantage of feature extraction methodologies developed for the visual domain. Hand-based biometrics are socially accepted and can operate in a touchless mode. However, certain deployment scenarios require captures in non-visual spectra due to impractical illumination requirements. Generating visual-like palm images from thermal ones faces challenges related to the nature of hand biometrics. Such challenges are the dynamic nature of the hand and the difficulties in accurately aligning the hand's scale and rotation, especially in the understudied thermal domain. Building such a synthetic solution is also challenged by the lack of large-scale databases containing images collected in both spectra, as well as by the need to generate images of appropriate resolutions. Driven by these challenges, this paper presents a novel solution to transform thermal palm images into high-quality visual-like images, regardless of the limited training data or scale and rotational variations. We demonstrated the quality similarity and high correlation of the generated images with the original visual images. We used the synthesized images within verification approaches based on CNN and hand-crafted features. This significantly improved the cross-spectral and thermal-to-thermal verification performance, reducing the EER from 37.12% to 16.25% and from 3.04% to 1.65%, respectively, when using CNN-based features.
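A heavily simplified sketch of the thermal-to-visual transformation idea: a small encoder-decoder maps a single-channel thermal palm crop to a three-channel visual-like image and is trained with a pixel-wise loss on paired data. The architecture, image size and loss are assumptions and do not reproduce the paper's method.

```python
import torch
import torch.nn as nn

class ThermalToVisual(nn.Module):
    """Tiny encoder-decoder mapping a 1-channel thermal palm crop to a 3-channel
    visual-like image (architecture and loss are illustrative only)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, thermal):
        return self.net(thermal)

gen = ThermalToVisual()
opt = torch.optim.Adam(gen.parameters(), lr=2e-4)
thermal = torch.rand(4, 1, 64, 64)     # dummy thermal palm crops
visual  = torch.rand(4, 3, 64, 64)     # paired visual images (dummy)
fake = gen(thermal)
loss = nn.functional.l1_loss(fake, visual)   # pixel loss; an adversarial term could be added
opt.zero_grad(); loss.backward(); opt.step()
```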
SANet: Smoothed Attention Network for Single Stage Face Detector
Pub Date: 2019-06-01 | DOI: 10.1109/ICB45273.2019.8987285
Lei Shi, Xiang Xu, I. Kakadiaris
Recently, significant effort has been devoted to exploring the role of feature fusion and enriched contextual information in detecting multi-scale faces. However, simply integrating features of different levels can introduce significant noise. Moreover, recently proposed approaches for enriching contextual information are either inefficient or ignore the gridding artifacts produced by dilated convolution. To tackle these issues, we developed a smoothed attention network (dubbed SANet), which introduces an Attention-guided Feature Fusion Module (AFFM) and a Smoothed Context Enhancement Module (SCEM). In particular, the AFFM applies an attention module to high-level semantic features and fuses the attention-focused features with low-level semantic features to reduce the noise of the fused feature map. The SCEM stacks dilated convolution and convolution layers alternately to re-learn the relationship among the completely separate sets of units produced by dilated convolution and thus maintain the consistency of local information. SANet achieves promising results on the WIDER FACE validation and testing datasets and is state-of-the-art on the UFDD dataset.
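A generic sketch of attention-guided feature fusion in the spirit of the AFFM description above: a spatial attention map computed from the high-level feature gates it before it is projected, upsampled and fused with the low-level feature. Channel sizes and the exact attention form are assumptions, not the module defined in the paper.

```python
import torch
import torch.nn as nn

class AttentionGuidedFusion(nn.Module):
    """Illustrative attention-guided fusion of high-level and low-level feature maps."""
    def __init__(self, high_ch, low_ch, out_ch):
        super().__init__()
        self.att = nn.Sequential(nn.Conv2d(high_ch, 1, 1), nn.Sigmoid())
        self.reduce_high = nn.Conv2d(high_ch, out_ch, 1)
        self.reduce_low = nn.Conv2d(low_ch, out_ch, 1)

    def forward(self, high, low):
        high = self.att(high) * high                       # attention-focused high-level map
        high = nn.functional.interpolate(self.reduce_high(high),
                                         size=low.shape[-2:], mode="nearest")
        return high + self.reduce_low(low)                 # fuse with low-level features

high = torch.rand(1, 256, 8, 8)    # deep, semantically strong features
low  = torch.rand(1, 64, 32, 32)   # shallow, spatially detailed features
print(AttentionGuidedFusion(256, 64, 128)(high, low).shape)   # torch.Size([1, 128, 32, 32])
```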
Multi-Modal Fingerprint Presentation Attack Detection: Analysing the Surface and the Inside
Pub Date: 2019-06-01 | DOI: 10.1109/ICB45273.2019.8987260
M. Gomez-Barrero, Jascha Kolberg, C. Busch
The deployment of biometric recognition systems has seen a considerable increase over the last decade, in particular for fingerprint-based systems. To tackle the security issues arising from presentation attacks launched on the biometric capture device, automatic presentation attack detection (PAD) methods have been proposed. In spite of their high detection rates on the LivDet databases, the vast majority of these methods rely on samples provided by traditional capture devices, which may fail to detect more sophisticated presentation attack instrument (PAI) species. In this paper, we propose a multi-modal fingerprint PAD method which relies on an analysis of: i) the surface of the finger within the short-wave infrared (SWIR) spectrum, and ii) the inside of the finger, thanks to laser speckle contrast imaging (LSCI) technology. In an experimental evaluation on a database comprising more than 4700 samples and 35 PAI species, and including unknown attacks to model a realistic scenario, a Detection Equal Error Rate (D-EER) of 0.5% is achieved. Moreover, for a BPCER ≤ 0.1% (i.e., a highly convenient system), the APCER remains around 3%.
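For reference, the PAD metrics quoted in the abstract (APCER, BPCER, D-EER) can be computed from detection scores as sketched below; the score convention (higher = more bona-fide-like) and the dummy score distributions are assumptions.

```python
import numpy as np

def pad_metrics(attack_scores, bonafide_scores, threshold):
    """APCER: fraction of attack presentations wrongly accepted as bona fide.
       BPCER: fraction of bona-fide presentations wrongly rejected.
       Convention (an assumption): higher score = more bona-fide-like."""
    apcer = np.mean(np.asarray(attack_scores) >= threshold)
    bpcer = np.mean(np.asarray(bonafide_scores) < threshold)
    return apcer, bpcer

def detection_eer(attack_scores, bonafide_scores):
    """D-EER: the operating point where APCER and BPCER are (approximately) equal."""
    thresholds = np.unique(np.concatenate([attack_scores, bonafide_scores]))
    best = min(thresholds,
               key=lambda t: abs(np.subtract(*pad_metrics(attack_scores, bonafide_scores, t))))
    return pad_metrics(attack_scores, bonafide_scores, best), best

rng = np.random.default_rng(3)
attacks  = rng.normal(0.3, 0.1, 1000)   # dummy PAD scores for attack presentations
bonafide = rng.normal(0.7, 0.1, 1000)   # dummy PAD scores for bona-fide presentations
(apcer, bpcer), thr = detection_eer(attacks, bonafide)
print(f"threshold={thr:.3f}  APCER={apcer:.3%}  BPCER={bpcer:.3%}")
```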