Real-Time Human Gaze Estimation
T. Rowntree, C. Pontecorvo, I. Reid
2019 Digital Image Computing: Techniques and Applications (DICTA), pp. 1-7, 2019-12-01
DOI: 10.1109/DICTA47822.2019.8945919 (https://doi.org/10.1109/DICTA47822.2019.8945919)
Citations: 1
Abstract
This paper describes a system for estimating the coarse gaze, or 1D head pose, of multiple people in a video stream from a moving camera in an indoor scene. The system runs at 30 Hz, detects human heads with an F-score of 87.2%, and predicts their gaze with an average error of 20.9°, including when they are facing directly away from the camera. The system uses two Convolutional Neural Networks (CNNs), one for head detection and one for gaze estimation, and applies common tracking and filtering techniques to smooth predictions over time. This paper is application-focused, so it describes the individual components of the system as well as the techniques used for collecting data and training the CNNs.
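
Because a 1D head pose is an angle on a circle, both the error measure and the temporal smoothing have to handle wrap-around at 0°/360°. The abstract does not say how the average error is computed or which filter is used, so the following is only a minimal sketch under assumptions: the function names, the exponential filter, and the `alpha` parameter are illustrative and not taken from the paper.

```python
import math


def angular_error_deg(pred, true):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    # Shift into [-180, 180) so that 350 deg vs 10 deg counts as 20 deg, not 340 deg.
    diff = (pred - true + 180.0) % 360.0 - 180.0
    return abs(diff)


def mean_angular_error_deg(preds, trues):
    """Average wrap-aware error over paired predictions and ground truth."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, trues)]
    return sum(errors) / len(errors)


def smooth_angle_deg(prev, new, alpha=0.3):
    """Exponential smoothing on the circle (an assumed filter, not the paper's).

    Blends the two angles as unit vectors and converts back to an angle,
    which avoids the jump a plain average would produce near 0/360 degrees.
    """
    x = (1 - alpha) * math.cos(math.radians(prev)) + alpha * math.cos(math.radians(new))
    y = (1 - alpha) * math.sin(math.radians(prev)) + alpha * math.sin(math.radians(new))
    return math.degrees(math.atan2(y, x)) % 360.0
```

For example, `angular_error_deg(350.0, 10.0)` evaluates to 20.0, whereas a naive subtraction would report 340. Smoothing via unit vectors is one common choice for angular signals; a tracker-based filter could be substituted without changing the error measure.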