César H. G. Andrade, Hendrio L. S. Bragança, Horácio Fernandes, Eduardo Feitosa, Eduardo Souto
Authentication in personal and corporate computer systems predominantly relies on login and password credentials, which are vulnerable to unauthorized access, especially when genuine users leave their devices unlocked. To address this issue, continuous authentication (CA) systems based on behavioral biometrics have gained attention. Traditional CA models leverage user–device interactions, such as mouse movements, typing dynamics, and speech recognition. This paper introduces a novel approach that utilizes system performance counters—attributes such as memory usage, CPU load, and network activity collected passively by operating systems (OSs)—to develop a robust and low-intrusive authentication mechanism. Our method employs a deep network architecture combining convolutional neural networks (CNNs) with long short-term memory (LSTM) layers to analyze temporal patterns and identify unique user behaviors. Unlike traditional methods, performance counters capture subtle system-level usage patterns that are harder to mimic, enhancing security and resilience to attacks. We integrate a trust model into the CA framework to balance security and usability, avoiding interruptions for genuine users while blocking impostors in real time. We evaluate our approach using two new datasets, COUNT-SO-I (26 users) and COUNT-SO-II (37 users), collected in real-world scenarios without specific task constraints. Our results demonstrate the feasibility and effectiveness of the proposed method, achieving 99% detection accuracy (ACC) for impostor users within an average of 17.2 s while maintaining seamless user experiences. These findings highlight the potential of performance counter–based CA systems for practical applications, such as safeguarding sensitive systems in corporate, governmental, and personal environments.
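As a rough illustration of the kind of CNN+LSTM pipeline the abstract describes, the sketch below stacks 1D convolutions over windows of performance-counter readings and feeds them to an LSTM; the window length, number of counters, and layer sizes are assumptions for illustration, not the authors' configuration.

```python
# Minimal DeepConvLSTM-style classifier for windows of performance-counter
# readings (assumed shape: batch x time x counters). Illustrative sizes only.
import torch
import torch.nn as nn

class DeepConvLSTM(nn.Module):
    def __init__(self, n_counters=20, n_classes=2, hidden=64):
        super().__init__()
        # 1D convolutions along the time axis extract local usage patterns
        self.conv = nn.Sequential(
            nn.Conv1d(n_counters, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        # LSTM models longer-range temporal dependencies between readings
        self.lstm = nn.LSTM(input_size=64, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)  # genuine vs. impostor

    def forward(self, x):                       # x: (batch, time, counters)
        x = self.conv(x.transpose(1, 2))        # -> (batch, 64, time)
        out, _ = self.lstm(x.transpose(1, 2))   # -> (batch, time, hidden)
        return self.head(out[:, -1])            # score from the last time step

scores = DeepConvLSTM()(torch.randn(8, 60, 20))  # 8 windows, 60 steps, 20 counters
```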
{"title":"A DeepConvLSTM Approach for Continuous Authentication Using Operational System Performance Counters","authors":"César H. G. Andrade, Hendrio L. S. Bragança, Horácio Fernandes, Eduardo Feitosa, Eduardo Souto","doi":"10.1049/bme2/8262252","DOIUrl":"10.1049/bme2/8262252","url":null,"abstract":"<p>Authentication in personal and corporate computer systems predominantly relies on login and password credentials, which are vulnerable to unauthorized access, especially when genuine users leave their devices unlocked. To address this issue, continuous authentication (CA) systems based on behavioral biometrics have gained attention. Traditional CA models leverage user–device interactions, such as mouse movements, typing dynamics, and speech recognition. This paper introduces a novel approach that utilizes system performance counters—attributes such as memory usage, CPU load, and network activity—collected passively by operating systems (OSs), to develop a robust and low-intrusive authentication mechanism. Our method employs a deep network architecture combining convolutional neural networks (CNNs) with long short-term memory (LSTM) layers to analyze temporal patterns and identify unique user behaviors. Unlike traditional methods, performance counters capture subtle system-level usage patterns that are harder to mimic, enhancing security and resilience to attacks. We integrate a trust model into the CA framework to balance security and usability by avoiding interruptions for genuine users while blocking impostors in real-time. We evaluate our approach using two new datasets, COUNT-SO-I (26 users) and COUNT-SO-II (37 users), collected in real-world scenarios without specific task constraints. Our results demonstrate the feasibility and effectiveness of the proposed method, achieving 99% detection accuracy (ACC) for impostor users within an average of 17.2 s, while maintaining seamless user experiences. These findings highlight the potential of performance counter–based CA systems for practical applications, such as safeguarding sensitive systems in corporate, governmental, and personal environments.</p>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2025 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/bme2/8262252","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144897286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhiyuan Shen, Xueyan Li, Junqi Bai, Kai Wang, Yifan Xu
Fatigue among air traffic controllers is a factor contributing to civil aviation crashes. Existing methods for extracting and fusing fatigue features face two main challenges: (1) the low accuracy of traditional single-mode fatigue recognition methods, and (2) the neglect of multimodal data correlations when traditional multimodal methods concatenate and fuse features. This paper proposes an interactive algorithm for fusing and recognizing multimode fatigue features that combines multihead attention (MHA) and cross-attention (XATTN), built on improved speech and facial fatigue recognition models. First, an improved conformer model, which combines a convolutional module with a transformer encoder, extracts deep speech features from controllers' radiotelephony communication data using the filter bank method. Second, controllers' facial data are processed via pointwise convolutions in a stack of inverted residual layers, which facilitates the extraction of facial features. Third, the speech and facial features are fused interactively by combining MHA and XATTN, achieving high accuracy in recognizing the fatigue state of controllers working in complex operational environments. A series of experiments was conducted on audiovisual datasets collected from actual air traffic control (ATC) missions. Compared with four competing methods for fusing multimodal features, the proposed method achieved a recognition accuracy of 99.2%, which was 8.9% higher than that of a speech single-mode model and 0.4% higher than that of a facial single-mode model.
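A minimal PyTorch sketch of how MHA and cross-attention can be combined to fuse speech and facial feature sequences, in the spirit of the fusion step described above; the dimensions, the symmetric query/key arrangement, and the mean pooling are illustrative assumptions rather than the paper's exact design.

```python
# Illustrative cross-modal fusion of speech and facial feature sequences
# with self-attention (MHA) followed by cross-attention (XATTN).
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=256, heads=4, n_classes=2):
        super().__init__()
        self.self_speech = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.self_face = nn.MultiheadAttention(dim, heads, batch_first=True)
        # cross-attention: each modality queries the other
        self.x_speech = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.x_face = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(2 * dim, n_classes)   # fatigued vs. alert

    def forward(self, speech, face):                 # (batch, seq, dim) each
        s, _ = self.self_speech(speech, speech, speech)
        f, _ = self.self_face(face, face, face)
        s2f, _ = self.x_speech(s, f, f)              # speech queries facial keys/values
        f2s, _ = self.x_face(f, s, s)                # face queries speech keys/values
        fused = torch.cat([s2f.mean(dim=1), f2s.mean(dim=1)], dim=-1)
        return self.head(fused)

logits = CrossModalFusion()(torch.randn(4, 50, 256), torch.randn(4, 30, 256))
```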
{"title":"A Dynamic Interactive Fusion Model for Extracting Fatigue Features Based on the Audiovisual Data Flow of Air Traffic Controllers","authors":"Zhiyuan Shen, Xueyan Li, Junqi Bai, Kai Wang, Yifan Xu","doi":"10.1049/bme2/7626919","DOIUrl":"10.1049/bme2/7626919","url":null,"abstract":"<p>Fatigue among air traffic controllers is a factor contributing to civil aviation crashes. Existing methods for extracting and fuzing fatigue features encounter two main challenges: (1) the low accuracy of traditional single-mode fatigue recognition methods, and (2) disregarding multimodal data correlations in traditional multimodal methods for feature concatenation and fusion. This paper proposes an interactive algorithm for the fusion and recognition of multimode fatigue features that combines multihead attention (MHA) and cross-attention (XATTN) which are based on an improved speech and facial fatigue recognition model. First, an improved conformer model which combines a convolutional module with a transformer encoder is proposed using the radiotelephony communication data of controllers by employing the filter bank method for extracting profound speech features. Second, facial data of controllers are processed via pointwise convolutions employing a stack of inverted residual layers, which facilitates the extraction of facial features. Third, speech and facial features are fuzed interactively by combining MHA and XATTN, which achieves high accuracy of recognizing the fatigue state of controllers working in complex operational environments. A series of experiments were conducted with audiovisual data sets collected from actual air traffic control (ATC) missions. Comparing with four competing methods for fuzing multimodal features, the experimental results indicate that the proposed method for fuzing multimode features achieved a recognition accuracy of 99.2%, which was 8.9% higher than that for a speech single-mode model and 0.4% higher than that for a facial single-mode model.</p>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2025 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/bme2/7626919","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144891645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Laurenz Ruzicka, Bernhard Kohn, Clemens Heitzinger
Biometric identification systems, particularly those utilizing fingerprints, have become essential for authenticating users due to their reliability and uniqueness. The recent shift towards contactless fingerprint sensors requires precise fingertip segmentation against changing backgrounds to maintain high accuracy. This study introduces a novel deep learning model called FingerUNeSt++, which combines the ResNeSt and UNet++ architectures, aimed at improving segmentation accuracy and inference speed for contactless fingerprint images. Our model significantly outperforms traditional and state-of-the-art methods, achieving superior performance metrics. Extensive data augmentation and an optimized model architecture contribute to its robustness and efficiency. This advancement holds promise for enhancing the effectiveness of contactless biometric systems in diverse real-world applications.
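One way to pair a ResNeSt encoder with a UNet++ decoder, loosely in the spirit of FingerUNeSt++, is via the third-party segmentation_models_pytorch package; the encoder identifier, loss, and input size below are assumptions for illustration and not the authors' implementation.

```python
# Sketch: ResNeSt backbone + UNet++ decoder for binary fingertip segmentation,
# using the third-party segmentation_models_pytorch library (not the paper's code).
import torch
import segmentation_models_pytorch as smp

model = smp.UnetPlusPlus(
    encoder_name="timm-resnest50d",   # assumed ResNeSt encoder identifier
    encoder_weights=None,             # or "imagenet" to load pretrained weights
    in_channels=3,                    # RGB contactless fingerprint image
    classes=1,                        # fingertip vs. background
)
loss_fn = smp.losses.DiceLoss(mode="binary")

images = torch.randn(2, 3, 256, 256)                       # toy batch
masks = torch.randint(0, 2, (2, 1, 256, 256)).float()      # toy ground-truth masks
loss = loss_fn(model(images), masks)
```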
{"title":"FingerUNeSt++: Improving Fingertip Segmentation in Contactless Fingerprint Imaging Using Deep Learning","authors":"Laurenz Ruzicka, Bernhard Kohn, Clemens Heitzinger","doi":"10.1049/bme2/9982355","DOIUrl":"10.1049/bme2/9982355","url":null,"abstract":"<p>Biometric identification systems, particularly those utilizing fingerprints, have become essential as a means of authenticating users due to their reliability and uniqueness. The recent shift towards contactless fingerprint sensors requires precise fingertip segmentation with changing backgrounds, to maintain high accuracy. This study introduces a novel deep learning model combining ResNeSt and UNet++ architectures called FingerUNeSt++, aimed at improving segmentation accuracy and inference speed for contactless fingerprint images. Our model significantly outperforms traditional and state-of-the-art methods, achieving superior performance metrics. Extensive data augmentation and an optimized model architecture contribute to its robustness and efficiency. This advancement holds promise for enhancing the effectiveness of contactless biometric systems in diverse real-world applications.</p>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2025 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/bme2/9982355","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144662898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, deepfake videos have emerged as a significant threat to societal and cybersecurity landscapes. Artificial intelligence (AI) techniques are used to create convincing deepfakes, and the main countermeasure is deepfake detection. Currently, most mainstream detectors are based on deep neural networks. Such deep learning detection frameworks face several problems that need to be addressed, for example, dependence on large annotated datasets, lack of interpretability, and limited attention to source traceability. To overcome these limitations, in this paper, we propose a novel training-free deepfake detection framework based on interpretable, inherent source attribution. The proposed framework not only distinguishes between real and fake videos but also traces their origins using camera fingerprints. Moreover, we have constructed a new deepfake video dataset from 10 distinct camera devices. Experimental evaluations on multiple datasets show that the proposed method attains high detection accuracies (ACCs), comparable to state-of-the-art (SOTA) deep learning techniques, and offers superior traceability. This framework provides a robust and efficient solution for deepfake video authentication and source attribution, making it highly adaptable to real-world scenarios.
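The sketch below illustrates the general camera-fingerprint idea behind source attribution: correlate a frame's noise residual against per-device reference fingerprints. The Gaussian denoiser, the normalized correlation score, and the decision threshold are placeholder choices, not the paper's pipeline.

```python
# Toy camera-fingerprint matching: compare a frame's noise residual against
# per-device reference fingerprints via normalized cross-correlation.
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(frame):
    """Sensor-noise estimate: grayscale float frame minus a smoothed version."""
    return frame - gaussian_filter(frame, sigma=1.0)

def ncc(a, b):
    """Normalized cross-correlation between two same-shaped arrays."""
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def attribute(frame, fingerprints, threshold=0.02):
    """Return the best-matching camera id, or None if nothing exceeds the threshold."""
    r = noise_residual(frame)
    scores = {cam: ncc(r, fp) for cam, fp in fingerprints.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

# fingerprints: dict mapping camera id -> averaged residual from known-real footage
```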
{"title":"Deepfake Video Traceability and Authentication via Source Attribution","authors":"Canghai Shi, Minglei Qiao, Zhuang Li, Zahid Akhtar, Bin Wang, Meng Han, Tong Qiao","doi":"10.1049/bme2/5687970","DOIUrl":"10.1049/bme2/5687970","url":null,"abstract":"<p>In recent years, deepfake videos have emerged as a significant threat to societal and cybersecurity landscapes. Artificial intelligence (AI) techniques are used to create convincing deepfakes. The main counter method is deepfake detection. Currently, most of the mainstream detectors are based on deep neural networks. Such deep learning detection frameworks often face several problems that need to be addressed, for example, dependence on large-annotated datasets, lack of interpretability, and limited attention to source traceability. Towards overcoming these limitations, in this paper, we propose a novel training-free deepfake detection framework based on the interpretable inherent source attribution. The proposed framework not only distinguishes between real and fake videos but also traces their origins using camera fingerprints. Moreover, we have also constructed a new deepfake video dataset from 10 distinct camera devices. Experimental evaluations on multiple datasets show that the proposed method can attain high detection accuracies (ACCs) comparable to state-of-the-art (SOTA) deep learning techniques and also has superior traceability capabilities. This framework provides a robust and efficient solution for deepfake video authentication and source attribution, thus, making it highly adaptable to real-world scenarios.</p>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2025 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/bme2/5687970","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144615060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fingerprints are unique biometric identifiers that reflect intricate genetic and environmental/physiological influences. Beyond their forensic significance, they can offer insights into physiological traits such as blood group and gender, which can help narrow down searches in forensic analysis. This exploratory study aims to identify potential associations between fingerprint patterns, gender, and blood groups within a defined regional cohort in Kathmandu, Nepal. The preliminary study included 290 students (144 males and 146 females) from Himalayan Whitehouse International College (HWIC). Fingerprint patterns (loops, whorls, and arches) were analyzed and compared with participants' ABO-Rh blood groups. Statistical analyses, including chi-square tests, were used to determine associations and trends. Loops were the most common fingerprint pattern (57.14%), followed by whorls (35%) and arches (7.86%). Blood group B+ve was the most prevalent (33.1%) in the study population. A significant association between gender and fingerprint pattern was observed: loops were predominant in females, while males showed a higher frequency of whorls. While no significant relationship was observed between ABO blood groups and fingerprint patterns, a significant association was found between fingerprint patterns and the Rh factor (p = 0.0496). Loops were more prevalent among Rh-positive (Rh+ve) individuals, while whorls were more common among Rh-negative (Rh−ve) individuals. Additionally, specific fingers tended to show particular patterns: arches were most prevalent on the index fingers of both hands, loops were most abundant on both pinky fingers and the left middle finger, and whorls were most frequent on the ring fingers of both hands and the right thumb. The findings reinforce global patterns of blood group and fingerprint distribution, in which Rh+ve individuals represent the majority and loops are the most common fingerprint pattern. The gender-specific trends suggest a nuanced interplay of genetics, with females displaying a higher frequency of loops and males showing more whorls; similarly, some blood groups are more likely to exhibit specific fingerprint patterns. This research highlights gender-based differences and the influence of genetic factors, particularly the Rh factor, on fingerprint patterns. These findings contribute to the growing field of dermatoglyphics, with implications for forensic science and population genetics.
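For readers unfamiliar with the statistic, the snippet below runs a chi-square test of independence on a hypothetical 3x2 contingency table of fingerprint pattern versus Rh factor; the counts are invented for illustration and are not the study's data.

```python
# Worked example of a chi-square test of independence (pattern vs. Rh factor)
# on made-up counts; the study's actual table is not reproduced here.
import numpy as np
from scipy.stats import chi2_contingency

#                 Rh+ve  Rh-ve
table = np.array([[160,    6],   # loops
                  [ 95,    7],   # whorls
                  [ 20,    2]])  # arches

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p:.4f}")
# A p-value below 0.05 indicates a significant association between fingerprint
# pattern and Rh factor, as reported (p = 0.0496) in the study.
```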
{"title":"A Dermatoglyphic Study of Primary Fingerprints Pattern in Relation to Gender and Blood Group Among Residents of Kathmandu Valley, Nepal","authors":"Sushma Paudel, Sushmita Paudel, Samikshya Kafle","doi":"10.1049/bme2/9993120","DOIUrl":"10.1049/bme2/9993120","url":null,"abstract":"<p>Fingerprints are unique biometric identifiers that reflect intricate genetic and environmental/physiological influences. Beyond their forensic significance, they can offer insights into physiological traits like blood groups and gender, which can help in forensic analysis to narrow down the search. This exploratory study aims to identify potential associations between fingerprint patterns, gender, and blood groups within a defined regional cohort in Kathmandu, Nepal. This preliminary study included 290 students (144 males and 146 females) from Himalayan Whitehouse International College (HWIC). Fingerprint patterns (loops, whorls, and arches) were analyzed and compared with participants’ ABO-Rh blood groups. Statistical analyses, including chi-square tests, were used to determine associations and trends. Loops emerged as the most common fingerprint pattern (57.14%), followed by whorls (35%), and arches (7.86%). Blood group B+ve was the most prevalent (33.1%) among the study population in Kathmandu. The significant association between gender and fingerprint pattern was observed. The gender analysis revealed that loops were predominant in females, while males showed a higher frequency of whorls. While no significant relationship was observed between ABO blood groups and fingerprint patterns, a strong association was found between fingerprint patterns and Rh factor (<i>p</i> = 0.0496). Loops were more prevalent among Rh-positive (Rh+ve) individuals, while whorls were more common among Rh-negative (Rh−ve) individuals. Additionally, specific fingers were observed to have distinct fingerprint patterns more frequently. Arches were most prevalent in the index finger of both hands, loops were most abundant in both pinky finger, and left middle finger. Whorls were most frequently observed in ring finger of both hands and right thumb. The findings reinforce global patterns of blood group and fingerprint distribution, where Rh+ve individuals represent the majority and loops are most dominant fingerprint pattern. The gender-specific trends suggest the nuanced interplay of genetics, with females displaying a higher frequency of loops and males showing more whorls. Similarly, some blood group are more likely to exhibit a specific set of fingerprint patterns. This research clearly shows the gender-based differences and influence of genetic factors on fingerprint patterns, particularly the Rh factor. These findings contribute to the growing field of dermatoglyphics, with implications for forensic science, and population genetics.</p>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2025 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/bme2/9993120","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144323692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Simon Kirchgasser, Christof Kauba, Georg Wimmer, Andreas Uhl
Natural scene statistics, commonly used in no-reference image quality measures, and a proposed deep-learning (DL)–based quality assessment approach are suggested as biometric quality indicators for vasculature images. While NIQE (natural image quality evaluator) and BRISQUE (blind/referenceless image spatial quality evaluator) do not work well for assessing the quality of vasculature pattern samples when trained on common images with typical distortions, their variants trained on high- and low-quality vasculature sample data behave as expected from a biometric quality estimator in most cases (deviations from the overall trend occur for certain datasets or feature extraction methods). The DL-based quality metric proposed in this work is designed to assign the correct quality class to vasculature pattern samples in most cases, independent of whether finger or hand vein patterns are being assessed. The experiments evaluating NIQE, BRISQUE, and the newly proposed DL quality metrics were conducted on a total of 13 publicly available finger and hand vein datasets and involve three distinct template representations (two of them designed especially for vascular biometrics). The proposed (trained) quality measures are compared to several classical quality metrics, and the achieved results underline their promising behavior.
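A minimal sketch of a DL-based two-class quality estimator (high- versus low-quality vein samples), in the spirit of the proposed metric; the architecture, input size, and grayscale input are assumptions, not the authors' network.

```python
# Tiny CNN that classifies a grayscale vein image as low- or high-quality.
import torch
import torch.nn as nn

class VeinQualityNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),          # global pooling to a 32-d descriptor
        )
        self.classifier = nn.Linear(32, 2)    # class 0 = low quality, 1 = high quality

    def forward(self, x):                     # x: (batch, 1, H, W)
        return self.classifier(self.features(x).flatten(1))

quality_logits = VeinQualityNet()(torch.randn(4, 1, 128, 128))
```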
{"title":"Advanced Image Quality Assessment for Hand- and Finger-Vein Biometrics","authors":"Simon Kirchgasser, Christof Kauba, Georg Wimmer, Andreas Uhl","doi":"10.1049/bme2/8869140","DOIUrl":"10.1049/bme2/8869140","url":null,"abstract":"<p>Natural scene statistics commonly used in nonreference image quality measures and a proposed deep-learning (DL)–based quality assessment approach are suggested as biometric quality indicators for vasculature images. While NIQE (natural image quality evaluator) and BRISQUE (blind/referenceless image spatial quality evaluator) if trained in common images with usual distortions do not work well for assessing vasculature pattern samples’ quality, their variants being trained on high- and low-quality vasculature sample data behave as expected from a biometric quality estimator in most cases (deviations from the overall trend occur for certain datasets or feature extraction methods). A DL-based quality metric is proposed in this work and designed to be capable of assigning the correct quality class to the vasculature pattern samples in most cases, independent of finger or hand vein patterns being assessed. The experiments, evaluating NIQE, BRISQUE, and the newly proposed DL quality metrics, were conducted on a total of 13 publicly available finger and hand vein datasets and involve three distinct template representations (two of them especially designed for vascular biometrics). The proposed (trained) quality measure(s) are compared to several classical quality metrics, with their achieved results underlining their promising behavior.</p>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2025 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/bme2/8869140","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143909172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chenlong Liu, Lu Yang, Wen Zhou, Yuan Li, Fanchang Hao
With the increasing application of biometric recognition technology in daily life, the number of registered users is growing rapidly, making fast retrieval techniques increasingly important for biometric recognition. However, existing biometric recognition models are often overly complex, making them difficult to deploy on resource-constrained terminal devices. Inspired by knowledge distillation (KD) for model simplification and deep hashing for fast image retrieval, we propose a new model that achieves lightweight palmprint and finger vein retrieval. The model integrates hash distillation loss, classification distillation loss, and supervised loss from labels within a KD framework, and its network design further improves the retrieval and recognition performance of the lightweight model. Experimental results demonstrate that this method improves the performance of the student model on multiple palmprint and finger vein datasets, with retrieval precision and recognition accuracy surpassing several existing advanced hashing methods.
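A hedged sketch of the three-part objective described above, combining a supervised classification loss with logit and hash-code distillation terms; the temperature, loss weights, and the tanh relaxation of the hash codes are illustrative choices, not necessarily those of the paper.

```python
# Combined training loss: supervised cross-entropy + classification distillation
# (softened teacher logits) + hash distillation (teacher vs. student codes).
import torch
import torch.nn.functional as F

def distillation_hashing_loss(student_logits, student_hash,
                              teacher_logits, teacher_hash,
                              labels, T=4.0, alpha=0.5, beta=0.5):
    # supervised loss from ground-truth labels
    ce = F.cross_entropy(student_logits, labels)
    # classification distillation: match the teacher's softened logits
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    # hash distillation: pull relaxed student codes toward the teacher's codes
    hash_kd = F.mse_loss(torch.tanh(student_hash), torch.tanh(teacher_hash))
    return ce + alpha * kd + beta * hash_kd
```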
{"title":"Deep Distillation Hashing for Palmprint and Finger Vein Retrieval","authors":"Chenlong Liu, Lu Yang, Wen Zhou, Yuan Li, Fanchang Hao","doi":"10.1049/bme2/9017371","DOIUrl":"10.1049/bme2/9017371","url":null,"abstract":"<p>With the increasing application of biometric recognition technology in daily life, the number of registered users is rapidly growing, making fast retrieval techniques increasingly important for biometric recognition. However, existing biometric recognition models are often overly complex, making them difficult to deploy on resource-constrained terminal devices. Inspired by knowledge distillation (KD) for model simplification and deep hashing for fast image retrieval, we propose a new model that achieves lightweight palmprint and finger vein retrieval. This model integrates hash distillation loss, classification distillation loss, and supervised loss from labels within a KD framework. And it improves the retrieval and recognition performance of the lightweight model through the network design. Experimental results demonstrate that this method promotes the performance of the student model on multiple palmprint and finger vein datasets, with retrieval precision and recognition accuracy surpassing several existing advanced hashing methods.</p>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2025 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/bme2/9017371","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143871822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Due to the abuse of deep forgery technology, research on forgery detection methods has become increasingly urgent. The correspondence between frequency-spectrum information and spatial clues, which is often neglected by current methods, can support more accurate and better-generalized forgery detection. Motivated by this observation, we propose a wavelet-based texture mining and enhancement framework for face forgery detection. First, we introduce a frequency-guided texture enhancement (FGTE) module that mines high-frequency information to improve the network's extraction of effective texture features. Next, we propose a global–local feature refinement (GLFR) module to enhance the model's use of both global semantic features and local texture features. Moreover, an interactive fusion module (IFM) is designed to fully integrate the enhanced texture clues with spatial features. The proposed method has been extensively evaluated for face forgery detection on five public datasets, namely FaceForensics++ (FF++), the DeepFake Detection Challenge (DFDC), Celeb-DF v2, the DFDC preview (DFDC-P), and DeepFake Detection (DFD), yielding promising performance in both within-dataset and cross-dataset experiments.
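As a small illustration of the frequency-guided idea, the snippet below uses a single-level discrete wavelet transform (via PyWavelets) to pull out the high-frequency detail subbands that texture-enhancement modules such as FGTE typically build on; the wavelet choice and the energy summary are assumptions for illustration.

```python
# Extract high-frequency detail subbands from a grayscale face crop with a
# single-level 2D discrete wavelet transform.
import numpy as np
import pywt

def high_frequency_subbands(gray_image):
    """Return horizontal, vertical, and diagonal detail coefficients."""
    _, (ch, cv, cd) = pywt.dwt2(gray_image, "haar")
    return ch, cv, cd

img = np.random.rand(256, 256)            # placeholder for a face crop in [0, 1]
ch, cv, cd = high_frequency_subbands(img)
texture_energy = sum(np.abs(b).mean() for b in (ch, cv, cd))  # crude texture cue
```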
{"title":"Wavelet-Based Texture Mining and Enhancement for Face Forgery Detection","authors":"Xin Li, Hui Zhao, Bingxin Xu, Hongzhe Liu","doi":"10.1049/bme2/2217175","DOIUrl":"10.1049/bme2/2217175","url":null,"abstract":"<p>Due to the abuse of deep forgery technology, the research on forgery detection methods has become increasingly urgent. The corresponding relationship between the frequency spectrum information and the spatial clues, which is often neglected by current methods, could be conducive to a more accurate and generalized forgery detection. Motivated by this inspiration, we propose a wavelet-based texture mining and enhancement framework for face forgery detection. First, we introduce a frequency-guided texture enhancement (FGTE) module that mining the high-frequency information to improve the network’s extraction of effective texture features. Next, we propose a global–local feature refinement (GLFR) module to enhance the model’s leverage of both global semantic features and local texture features. Moreover, the interactive fusion module (IFM) is designed to fully incorporate the enhanced texture clues with spatial features. The proposed method has been extensively evaluated on five public datasets, such as FaceForensics++ (FF++), deepfake (DF) detection (DFD) challenge (DFDC), Celeb-DFv2, DFDC preview (DFDC-P), and DFD, for face forgery detection, yielding promising performance within and cross dataset experiments.</p>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2025 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/bme2/2217175","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143404595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Emotions play a significant role in how we perceive and interact with products. Thoughtfully designed, emotionally appealing products can evoke strong user responses, making them more attractive. Color, as a crucial attribute of products, is an important aspect to consider in emotional product design. However, users' emotional perception of product colors is highly intricate and challenging to define. To address this, this research proposes a product color design concept that considers human emotion perception, based on deep learning and cluster analysis. First, for a given product, a color style is chosen for rerendering in the form of an emotional color image; different emotional color images have distinct RGB color representations. Second, clustering methods are employed to establish relationships between the various emotional color images and different colors, and emotionally close style images are selected. Subsequently, transfer learning techniques are used to retrain network weights within specific network structures, allowing the fusion of style and content images. This process ultimately achieves emotional color rendering based on emotional color clustering and transfer learning. Multiple sets of emotional color design examples demonstrate that the proposed method can accurately fulfill the emotional color design requirements of products, offering practical applicability. A satisfaction survey shows that the proposed method provides useful guidance for emotional color design in clothing.
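A hedged sketch of the clustering step: k-means extracts dominant RGB colors from each emotional reference image, and the emotionally closest style is picked by palette distance. The cluster count, distance measure, and helper names are illustrative assumptions, not the paper's procedure.

```python
# Cluster pixels into dominant RGB colors and pick the nearest reference palette.
import numpy as np
from sklearn.cluster import KMeans

def dominant_colors(image_rgb, k=5):
    """image_rgb: (H, W, 3) array in [0, 255]; returns k cluster centers (k, 3)."""
    pixels = image_rgb.reshape(-1, 3).astype(float)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels).cluster_centers_

def closest_style(target_palette, style_palettes):
    """Pick the emotional color image whose mean dominant color is nearest."""
    t = target_palette.mean(axis=0)
    dists = {name: np.linalg.norm(t - p.mean(axis=0))
             for name, p in style_palettes.items()}
    return min(dists, key=dists.get)

# style_palettes: dict mapping an emotion label (e.g. "calm") to its dominant_colors()
```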
{"title":"Product Color Design Concept that Considers Human Emotion Perception: Based on Deep Learning and Cluster Analysis","authors":"Anqi Gao, Yantao Zhong","doi":"10.1049/bme2/5576927","DOIUrl":"10.1049/bme2/5576927","url":null,"abstract":"<p>Emotions play a significant role in how we perceive and interact with products. Thoughtfully designed emotionally appealing products can evoke strong user responses, making them more attractive. Color, as a crucial attribute of products, is a significant aspect to consider in the process of emotional product design. However, users’ emotional perception of product colors is highly intricate and challenging to define. To address this, this research proposes a product color design concept that considers human emotion perception based on deep learning and cluster analysis. First, for a given product, a color style is chosen for rerendering, which is an emotional color image. Different emotional color images have distinct RGB color representations. Second, clustering methods are employed to establish relationships between various emotional color images and different colors, selecting emotionally close style images. Subsequently, through transfer learning techniques, specific grid structures are used to retrain network weights, allowing for the fusion design of style and content images. This process ultimately achieves emotional color rendering design based on emotional color clustering and transfer learning. Multiple sets of emotional color design examples demonstrate that the method proposed in this study can accurately fulfill the emotional color design requirements of products, thereby, offering practical applicability. The satisfaction survey shows that the proposed method has certain guiding significance for clothing emotional color design.</p>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2024 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/bme2/5576927","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143118922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Byeong Seon An, Hyeji Lim, Hyeon Ah Seong, Eui Chul Lee
Deepfake (DF) involves utilizing artificial intelligence (AI) technology to synthesize or manipulate images, voices, and other human or object data. Recent times have seen a surge in the misuse of DF technology, raising concerns about cybercrime and the credibility of manipulated information. The objective of this study is to devise a method that employs remote photoplethysmography (rPPG) biosignals for DF detection. The face was divided into five regions based on landmarks, and the neck region was extracted automatically. We extracted rPPG signals from each facial region, and the signal from the neck region was defined as the ground truth. The five facial signals were turned into similarity features by calculating the Euclidean distance between each signal and the neck signal, and these five features were used as inputs to a support vector machine (SVM) model. Our approach demonstrated robust performance, with an area under the curve (AUC) score of 91.2% on the audio-driven dataset and 99.7% on the face swapping generative adversarial network (FSGAN) dataset, even though we used only subsets of the Korean DF Detection Dataset (KoDF) that exclude DF techniques identifiable by visual inspection. Therefore, our findings demonstrate that rPPG signal similarity features can serve as key features for detecting DFs.
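A minimal sketch of the similarity-feature idea: the Euclidean distance between each facial region's rPPG trace and the neck (ground-truth) trace yields a five-dimensional feature vector for an SVM. The signal extraction itself is stubbed with random arrays, and the kernel and toy labels are assumptions.

```python
# Build 5-d rPPG similarity features (face regions vs. neck reference) and
# train an SVM on toy data; real rPPG extraction is out of scope here.
import numpy as np
from sklearn.svm import SVC

def similarity_features(face_signals, neck_signal):
    """face_signals: (5, T) rPPG traces; neck_signal: (T,) reference trace."""
    return np.array([np.linalg.norm(s - neck_signal) for s in face_signals])

rng = np.random.default_rng(0)
# X: one 5-d feature vector per video, y: 1 = real, 0 = deepfake (toy labels)
X = np.stack([similarity_features(rng.normal(size=(5, 300)),
                                  rng.normal(size=300)) for _ in range(20)])
y = rng.integers(0, 2, size=20)
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X[:3]))
```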
{"title":"Facial and Neck Region Analysis for Deepfake Detection Using Remote Photoplethysmography Signal Similarity","authors":"Byeong Seon An, Hyeji Lim, Hyeon Ah Seong, Eui Chul Lee","doi":"10.1049/bme2/7095412","DOIUrl":"10.1049/bme2/7095412","url":null,"abstract":"<p>Deepfake (DF) involves utilizing artificial intelligence (AI) technology to synthesize or manipulate images, voices, and other human or object data. However, recent times have seen a surge in instances of DF technology misuse, raising concerns about cybercrime and the credibility of manipulated information. The objective of this study is to devise a method that employs remote photoplethysmography (rPPG) biosignals for DF detection. The face was divided into five regions based on landmarks, with automatic extraction performed on the neck region. We conducted rPPG signal extraction from each facial area and the neck region was defined as the ground truth. The five signals extracted from the face were used as inputs to an support vector machine (SVM) model by calculating the euclidean distance between each signal and the signal extracted from the neck region, measuring rPPG signal similarity with five features. Our approach demonstrated robust performance with an area under the curve (AUC) score of 91.2% on the audio-driven dataset and 99.7% on the face swapping generative adversarial network (FSGAN) dataset, even though we only used datasets excluding DF techniques that can be visually identified in Korean DF Detection Dataset (KoDF). Therefore, our research findings demonstrate that similarity features of rPPG signals can be utilized as key features for detecting DFs.</p>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2024 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/bme2/7095412","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142708010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}