A two-stage algorithm for heterogeneous face recognition using Deep Stacked PCA Descriptor (DSPD) and Coupled Discriminant Neighbourhood Embedding (CDNE)
Author: Shubhobrata Bhattacharya
Journal: Neural Computing and Applications
Published: 2024-08-14
DOI: 10.1007/s00521-024-10272-5
Citations: 0
Abstract
Automatic face recognition has made significant progress in recent decades, particularly in controlled environments. However, recognizing faces across different imaging modalities, known as Heterogeneous Face Recognition (HFR), remains challenging because of the modality gap between the images being compared. This paper addresses HFR with a two-stage algorithm. In the first stage, a deep stacked PCA descriptor (DSPD) is introduced to extract domain-invariant features from face images of different modalities. The DSPD uses multiple convolution layers of domain-trained PCA filters, and the features extracted from each layer are concatenated to form the final feature representation. Input images are pre-processed to make facial edges more prominent, which makes the features more distinctive. The resulting DSPD features can be used directly for recognition with nearest neighbour algorithms. To further improve robustness, the second stage proposes a coupled subspace method called coupled discriminant neighbourhood embedding (CDNE). CDNE is trained with a limited number of data samples and projects DSPD features from different modalities onto a common subspace. In this subspace, data points of the same subject from different modalities lie close together, while those of different subjects lie far apart, which improves nearest neighbour recognition of heterogeneous faces. Experimental results demonstrate the effectiveness of the proposed algorithm in several HFR scenarios, including VIS-NIR, VIS-Sketch, and VIS-Thermal face pairs from the respective databases. The algorithm shows promising performance in bridging the modality gap, offering a potential solution for accurate and robust HFR.
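The first-stage idea (stacked layers of PCA-learned filters, per-layer features concatenated, then nearest neighbour matching) can be illustrated with a small NumPy sketch. The abstract does not give filter sizes, layer counts, the pooling scheme, or the edge-enhancing pre-processing, so everything below is an assumption made for illustration in the style of PCANet-like descriptors: the class name `StackedPCADescriptor`, the 5x5 filters, four filters per layer, grid average pooling of absolute responses, and cosine matching are all hypothetical choices, and the pre-processing and the CDNE stage are omitted. It should not be read as the paper's implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def extract_patches(img, k):
    """All k x k patches of a 2-D image as rows, each with its own mean removed."""
    patches = sliding_window_view(img, (k, k)).reshape(-1, k * k)
    return patches - patches.mean(axis=1, keepdims=True)


def learn_pca_filters(images, k, n_filters):
    """Filters = leading eigenvectors of the patch covariance (assumed scheme)."""
    patches = np.vstack([extract_patches(img, k) for img in images])
    cov = patches.T @ patches / patches.shape[0]
    _, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n_filters]     # keep the largest-variance directions
    return top.T.reshape(n_filters, k, k)


def apply_filters(img, filters):
    """Valid-mode correlation of one image with every filter: (n_filters, H', W')."""
    k = filters.shape[-1]
    windows = sliding_window_view(img, (k, k))
    return np.einsum("hwij,fij->fhw", windows, filters)


def pool(maps, grid=4):
    """Average absolute response over a grid x grid layout of cells per map."""
    feats = []
    for m in maps:
        rows = np.array_split(np.arange(m.shape[0]), grid)
        cols = np.array_split(np.arange(m.shape[1]), grid)
        feats.extend(np.abs(m[np.ix_(r, c)]).mean() for r in rows for c in cols)
    return np.array(feats)


class StackedPCADescriptor:
    """Hypothetical stacked-PCA descriptor; all hyperparameters are assumptions."""

    def __init__(self, k=5, n_filters=4, n_layers=2):
        self.k, self.n_filters, self.n_layers = k, n_filters, n_layers
        self.filters = []

    def fit(self, images):
        self.filters = []
        current = list(images)
        for _ in range(self.n_layers):
            f = learn_pca_filters(current, self.k, self.n_filters)
            self.filters.append(f)
            # Outputs of this layer become the training inputs of the next one.
            current = [m for img in current for m in apply_filters(img, f)]
        return self

    def transform(self, img):
        current, feats = [img], []
        for f in self.filters:
            current = [m for x in current for m in apply_filters(x, f)]
            feats.append(pool(current))       # one feature vector per layer
        v = np.concatenate(feats)             # concatenate across layers
        return v / (np.linalg.norm(v) + 1e-12)


def nearest_neighbour(probe, gallery):
    """Index of the gallery row most similar to the probe (cosine, unit vectors)."""
    return int(np.argmax(gallery @ probe))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    faces = [rng.random((24, 24)) for _ in range(6)]   # stand-in images, not real faces
    dspd = StackedPCADescriptor().fit(faces)
    gallery = np.stack([dspd.transform(f) for f in faces])
    print(gallery.shape, nearest_neighbour(gallery[2], gallery))
```

In a heterogeneous setting, the gallery would hold descriptors from one modality (for example VIS) and the probe would come from another (for example NIR). The second stage described in the abstract would then map both sets of descriptors into a common subspace before the nearest neighbour step.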