A fast differential network with adaptive reference sample for gaze estimation

Computer Vision and Image Understanding, Volume 249, Article 104156 | IF 4.3 | Zone 3 (Computer Science) | JCR Q2 (Computer Science, Artificial Intelligence) | Pub Date: 2024-10-09 | DOI: 10.1016/j.cviu.2024.104156

Jiahui Hu, Yonghua Lu, Xiyuan Ye, Qiang Feng, Lihua Zhou

Citations: 0

Abstract

Most non-invasive gaze estimation methods ignore inter-individual differences in anatomical structure and regress the gaze direction directly from appearance image information, which limits the accuracy of individual-independent gaze estimation networks. In addition, existing gaze estimation methods tend to focus only on improving generalization performance and ignore the crucial issue of efficiency, which leads to bulky models that are difficult to deploy and of questionable cost-effectiveness in practical use. This paper makes the following contributions: (1) A differential network for gaze estimation with adaptive reference samples is proposed, which selects reference samples adaptively based on scene and individual characteristics. (2) Knowledge distillation is used to transfer the knowledge structure of robust teacher networks into lightweight networks, so that the resulting networks run quickly and at low computational cost, which greatly improves the prospects and value of applying gaze estimation in practice. (3) Integrating the above innovations, a novel fast differential neural network (Diff-Net) named FDAR-Net is constructed, which achieves excellent results on MPIIGaze, UTMultiview and EyeDiap.
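The abstract names two ideas without giving details: differential estimation against reference samples, and knowledge distillation into a lightweight network. The sketch below is only an illustration of how such pieces are commonly put together, not FDAR-Net itself. The nearest-neighbour selection rule, the `diff_fn` callable standing in for a trained differential network, the squared-error losses, and the `alpha` weight are all assumptions introduced here.

```python
import numpy as np


def select_references(query_feat, ref_feats, k=3):
    """Pick the k references closest to the query in feature space.

    Hypothetical rule (Euclidean nearest neighbours); the abstract does
    not describe the paper's actual adaptive selection criterion.
    """
    dists = np.linalg.norm(ref_feats - query_feat, axis=1)
    return np.argsort(dists)[:k]


def differential_estimate(query_feat, ref_feats, ref_gazes, diff_fn, k=3):
    """Estimate gaze as a reference gaze plus a predicted gaze difference.

    diff_fn(query_feat, ref_feat) is a placeholder for a trained
    differential network returning the (query - reference) gaze offset.
    The per-reference estimates are averaged.
    """
    idx = select_references(query_feat, ref_feats, k)
    estimates = [ref_gazes[i] + diff_fn(query_feat, ref_feats[i]) for i in idx]
    return np.mean(estimates, axis=0)


def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    """Weighted sum of a ground-truth loss and a teacher-matching loss.

    Assumed form for a regression task; alpha is an illustrative weight.
    """
    hard = np.mean((student_out - target) ** 2)       # fit the labels
    soft = np.mean((student_out - teacher_out) ** 2)  # mimic the teacher
    return alpha * hard + (1.0 - alpha) * soft


if __name__ == "__main__":
    # Toy setup: gaze is a linear function of the features, so a perfect
    # difference predictor recovers the query gaze from any reference.
    rng = np.random.default_rng(0)
    W = rng.normal(size=(2, 4))
    ref_feats = rng.normal(size=(5, 4))
    ref_gazes = ref_feats @ W.T
    query = rng.normal(size=4)
    est = differential_estimate(
        query, ref_feats, ref_gazes, lambda q, r: W @ (q - r)
    )
    print("estimated gaze:", est)
    print("true gaze:     ", W @ query)
```

Predicting a difference relative to a reference from the same person means person-specific offsets tend to cancel out, which is the usual motivation for differential approaches. The distillation term lets a small student network imitate the outputs of a larger teacher while still being trained on the labels.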

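Source journal: Computer Vision and Image Understanding (Engineering and Technology: Engineering, Electrical and Electronic)

CiteScore: 7.80
Self-citation rate: 4.40%
Articles published: 112
Review time: 79 days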

Journal description: The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis, from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views.

Research areas include:
• Theory
• Early vision
• Data structures and representations
• Shape
• Range
• Motion
• Matching and recognition
• Architecture and languages
• Vision systems
