Efficient ear alignment using a two-stack hourglass network

IF 1.8 · CAS Zone 4 (Computer Science) · JCR Q3 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) · IET Biometrics · Pub Date: 2023-03-13 · DOI: 10.1049/bme2.12109
Anja Hrovatič, Peter Peer, Vitomir Štruc, Žiga Emeršič
Citations: 2

Abstract

Ear images have been shown to be a reliable modality for biometric recognition with desirable characteristics, such as high universality, distinctiveness, measurability and permanence. While a considerable amount of research has been directed towards ear recognition techniques, the problem of ear alignment is still under-explored in the open literature. Nonetheless, accurate alignment of ear images, especially in unconstrained acquisition scenarios, where the ear appearance is expected to vary widely due to pose and view point variations, is critical for the performance of all downstream tasks, including ear recognition. Here, the authors address this problem and present a framework for ear alignment that relies on a two-step procedure: (i) automatic landmark detection and (ii) fiducial point alignment. For the first (landmark detection) step, the authors implement and train a Two-Stack Hourglass model (2-SHGNet) capable of accurately predicting 55 landmarks on diverse ear images captured in uncontrolled conditions. For the second (alignment) step, the authors use the Random Sample Consensus (RANSAC) algorithm to align the estimated landmark/fiducial points with a pre-defined ear shape (i.e. a collection of average ear landmark positions). The authors evaluate the proposed framework in comprehensive experiments on the AWEx and ITWE datasets and show that the 2-SHGNet model leads to more accurate landmark predictions than competing state-of-the-art models from the literature. Furthermore, the authors also demonstrate that the alignment step significantly improves recognition accuracy with ear images from unconstrained environments compared to unaligned imagery.
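The abstract's second (alignment) step — fitting the predicted landmarks to a pre-defined average ear shape with RANSAC — can be sketched in a few lines of numpy. This is a minimal illustrative sketch, not the authors' implementation: the similarity-transform model, the helper names, and the iteration/threshold values are all assumptions.

```python
import numpy as np


def similarity_from_pairs(src, dst):
    """Closed-form similarity transform (scale + rotation + translation)
    from exactly two 2-D point correspondences, via complex arithmetic."""
    s = src[:, 0] + 1j * src[:, 1]
    d = dst[:, 0] + 1j * dst[:, 1]
    a = (d[1] - d[0]) / (s[1] - s[0])  # complex scale * rotation
    b = d[0] - a * s[0]                # complex translation
    return a, b


def apply_sim(a, b, pts):
    """Apply the similarity transform z -> a*z + b to an (N, 2) array."""
    z = pts[:, 0] + 1j * pts[:, 1]
    w = a * z + b
    return np.stack([w.real, w.imag], axis=1)


def ransac_align(landmarks, mean_shape, n_iter=500, thresh=3.0, rng=None):
    """Fit a similarity transform mapping `landmarks` onto `mean_shape`,
    robust to outlier landmark predictions (basic RANSAC loop)."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(landmarks)
    best_inliers = None
    for _ in range(n_iter):
        # Minimal sample: two correspondences determine a similarity.
        idx = rng.choice(n, size=2, replace=False)
        a, b = similarity_from_pairs(landmarks[idx], mean_shape[idx])
        err = np.linalg.norm(apply_sim(a, b, landmarks) - mean_shape, axis=1)
        inliers = err < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Least-squares refit on the inlier set (complex Procrustes form).
    s = landmarks[best_inliers, 0] + 1j * landmarks[best_inliers, 1]
    d = mean_shape[best_inliers, 0] + 1j * mean_shape[best_inliers, 1]
    s0, d0 = s.mean(), d.mean()
    a = ((s - s0).conj() @ (d - d0)) / ((s - s0).conj() @ (s - s0))
    b = d0 - a * s0
    return a, b, best_inliers
```

In practice one would warp the full ear image with the recovered transform (e.g. with an image-warping routine) rather than just the landmark set; the sketch only shows the robust-fit core.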



Source journal

IET Biometrics (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

CiteScore: 5.90
Self-citation rate: 0.00%
Annual articles: 46
Review time: 33 weeks
Journal description: The field of biometric recognition - automated recognition of individuals based on their behavioural and biological characteristics - has now reached a level of maturity where viable practical applications are both possible and increasingly available. The biometrics field is characterised especially by its interdisciplinarity since, while focused primarily around a strong technological base, effective system design and implementation often requires a broad range of skills encompassing, for example, human factors, data security and database technologies, psychological and physiological awareness, and so on. Also, the technology focus itself embraces diversity, since the engineering of effective biometric systems requires integration of image analysis, pattern recognition, sensor technology, database engineering, security design and many other strands of understanding. The scope of the journal is intentionally relatively wide. While focusing on core technological issues, it is recognised that these may be inherently diverse and in many cases may cross traditional disciplinary boundaries. The scope of the journal will therefore include any topics where it can be shown that a paper can increase our understanding of biometric systems, signal future developments and applications for biometrics, or promote greater practical uptake for relevant technologies:

- Development and enhancement of individual biometric modalities including the established and traditional modalities (e.g. face, fingerprint, iris, signature and handwriting recognition) and also newer or emerging modalities (gait, ear-shape, neurological patterns, etc.)
- Multibiometrics, theoretical and practical issues, implementation of practical systems, multiclassifier and multimodal approaches
- Soft biometrics and information fusion for identification, verification and trait prediction
- Human factors and the human-computer interface issues for biometric systems, exception handling strategies
- Template construction and template management, ageing factors and their impact on biometric systems
- Usability and user-oriented design, psychological and physiological principles and system integration
- Sensors and sensor technologies for biometric processing
- Database technologies to support biometric systems
- Implementation of biometric systems, security engineering implications, smartcard and associated technologies in implementation, implementation platforms, system design and performance evaluation
- Trust and privacy issues, security of biometric systems and supporting technological solutions, biometric template protection
- Biometric cryptosystems, security and biometrics-linked encryption
- Links with forensic processing and cross-disciplinary commonalities
- Core underpinning technologies (e.g. image analysis, pattern recognition, computer vision, signal processing, etc.), where the specific relevance to biometric processing can be demonstrated
- Applications and application-led considerations
- Position papers on technology or on the industrial context of biometric system development
- Adoption and promotion of standards in biometrics, improving technology acceptance, deployment and interoperability, avoiding cross-cultural and cross-sector restrictions
- Relevant ethical and social issues
Latest articles from this journal:

- Facial and Neck Region Analysis for Deepfake Detection Using Remote Photoplethysmography Signal Similarity
- A Multimodal Biometric Recognition Method Based on Federated Learning
- Deep and Shallow Feature Fusion in Feature Score Level for Palmprint Recognition
- Research on TCN Model Based on SSARF Feature Selection in the Field of Human Behavior Recognition
- A Finger Vein Recognition Algorithm Based on the Histogram of Variable Curvature Directional Binary Statistics