{"title":"走向实时头部检测和凝视估计","authors":"Kunxu Zhao, Zhengxi Hu, Qianyi Zhang, Jingtai Liu","doi":"10.1109/ROBIO55434.2022.10011802","DOIUrl":null,"url":null,"abstract":"As an important way of understanding human in-tentions, gaze estimation has always been a research hotspot in the field of human-robot interaction. Most studies now estimate gaze direction by analyzing head features and head detection is required before gaze estimation. For these two sequential tasks, the current research usually adopts two different networks, which increases the memory occupation of the graphics card and is not easy to deploy on the edge device. In this paper, we propose a unified network for simultaneous head detection and gaze estimation, unifying these two tasks into a multi-task learning model. In this network framework, head detection and gaze estimation share the same set of features, which enables them to promote each other to improve detection accuracy. We evaluated our model on gaze360 dataset and the gaze error dropped to 19.62 degrees while running at 23 fps.","PeriodicalId":151112,"journal":{"name":"2022 IEEE International Conference on Robotics and Biomimetics (ROBIO)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"RTHG: Towards Real- Time Head Detection And Gaze Estimation\",\"authors\":\"Kunxu Zhao, Zhengxi Hu, Qianyi Zhang, Jingtai Liu\",\"doi\":\"10.1109/ROBIO55434.2022.10011802\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As an important way of understanding human in-tentions, gaze estimation has always been a research hotspot in the field of human-robot interaction. Most studies now estimate gaze direction by analyzing head features and head detection is required before gaze estimation. For these two sequential tasks, the current research usually adopts two different networks, which increases the memory occupation of the graphics card and is not easy to deploy on the edge device. In this paper, we propose a unified network for simultaneous head detection and gaze estimation, unifying these two tasks into a multi-task learning model. In this network framework, head detection and gaze estimation share the same set of features, which enables them to promote each other to improve detection accuracy. We evaluated our model on gaze360 dataset and the gaze error dropped to 19.62 degrees while running at 23 fps.\",\"PeriodicalId\":151112,\"journal\":{\"name\":\"2022 IEEE International Conference on Robotics and Biomimetics (ROBIO)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Conference on Robotics and Biomimetics (ROBIO)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ROBIO55434.2022.10011802\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Robotics and Biomimetics (ROBIO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ROBIO55434.2022.10011802","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
RTHG: Towards Real- Time Head Detection And Gaze Estimation
As an important way of understanding human in-tentions, gaze estimation has always been a research hotspot in the field of human-robot interaction. Most studies now estimate gaze direction by analyzing head features and head detection is required before gaze estimation. For these two sequential tasks, the current research usually adopts two different networks, which increases the memory occupation of the graphics card and is not easy to deploy on the edge device. In this paper, we propose a unified network for simultaneous head detection and gaze estimation, unifying these two tasks into a multi-task learning model. In this network framework, head detection and gaze estimation share the same set of features, which enables them to promote each other to improve detection accuracy. We evaluated our model on gaze360 dataset and the gaze error dropped to 19.62 degrees while running at 23 fps.