Synthesizing multilevel abstraction ear sketches for enhanced biometric recognition

David Freire-Obregón, Joao Neves, Žiga Emeršič, Blaž Meden, Modesto Castrillón-Santana, Hugo Proença

Image and Vision Computing, Volume 154, Article 105424, published 2025-02-01
DOI: 10.1016/j.imavis.2025.105424
URL: https://www.sciencedirect.com/science/article/pii/S0262885625000125
Citations: 0
Abstract
Sketch understanding poses unique challenges for general-purpose vision algorithms due to the sparse and semantically ambiguous nature of sketches. This paper introduces a novel approach to biometric recognition that leverages sketch-based representations of ears, a largely unexplored but promising area in biometric research. Specifically, we address the “sketch-2-image” matching problem by synthesizing ear sketches at multiple abstraction levels, achieved through a triplet-loss function adapted to integrate these levels. The abstraction level is determined by the number of strokes used, with fewer strokes reflecting higher abstraction. Our methodology combines sketch representations across abstraction levels to improve robustness and generalizability in matching. Extensive evaluations were conducted on four ear datasets (AMI, AWE, IITDII, and BIPLab) using various pre-trained neural network backbones, showing consistently superior performance over state-of-the-art methods. These results highlight the potential of ear sketch-based recognition, with cross-dataset tests confirming its adaptability to real-world conditions and suggesting applicability beyond ear biometrics.
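The abstract states that matching is learned with a triplet loss adapted to integrate several abstraction levels, where a sketch's abstraction level is set by its stroke count. The paper's exact formulation is not reproduced here, so the following is only a minimal illustrative sketch under stated assumptions: a standard squared-Euclidean triplet loss, with a hypothetical `multilevel_triplet_loss` that simply averages the loss over sketch embeddings at different abstraction levels (the function names, the averaging strategy, and the margin value of 0.2 are assumptions, not the authors' method).

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the anchor (a sketch embedding) toward the
    positive (ear image of the same identity) and push it away from the
    negative (ear image of a different identity) by at least `margin`."""
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance to positive
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance to negative
    return max(0.0, d_pos - d_neg + margin)

def multilevel_triplet_loss(sketch_levels, positive, negative, margin=0.2):
    """Hypothetical multilevel variant: average the triplet loss over sketch
    embeddings of the same ear rendered at several abstraction levels
    (i.e., with different stroke counts)."""
    losses = [triplet_loss(s, positive, negative, margin) for s in sketch_levels]
    return float(np.mean(losses))
```

Averaging across levels is one plausible way to make a single embedding space serve both highly abstract (few-stroke) and detailed sketches; the paper may combine levels differently.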
Journal overview:
Image and Vision Computing's primary aim is to provide an effective medium of interchange for the results of high-quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real-world scenes. It seeks to foster a deeper understanding of the discipline by encouraging quantitative comparison and performance evaluation of the proposed methodology. Coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, and image databases.