Simple Real-Time Multi-face Tracking Based on Convolutional Neural Networks
Xile Li, J. Lang
2018 15th Conference on Computer and Robot Vision (CRV), published 2018-05-08
DOI: 10.1109/CRV.2018.00054 (https://doi.org/10.1109/CRV.2018.00054)
Citation count: 2
Abstract
We present a simple real-time system that tracks multiple faces in live video, such as broadcasts and real-time conference recordings. The system comprises three parts: face detection, feature extraction, and tracking. We employ a previously proposed cascaded Multi-Task Convolutional Neural Network (MTCNN) to detect faces and a simple CNN to extract features from the detected faces, and we show that a shallow network operating on the extracted feature maps is sufficient for face tracking. Our multi-face tracker runs in real time without any online training. We do not adjust any parameters for different input videos, and the tracker's run time does not increase significantly as the number of tracked faces grows, so it is easy to deploy in new real-time applications. We evaluate our tracker on two commonly used metrics against five recent face trackers. Our simple tracker performs competitively with these trackers despite occlusions in the videos and false positives or false negatives during face detection.
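The three stages named in the abstract (detection, feature extraction, tracking) could be wired together roughly as in the minimal sketch below. It is not the authors' implementation. The `detect` and `extract` callables are hypothetical stand-ins for MTCNN and the simple CNN, and the greedy cosine-similarity matching is a swapped-in substitute for the shallow tracking network, whose architecture the abstract does not describe. The class name, the box layout, and the `min_similarity` threshold are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # assumed layout: (x, y, width, height)
Feature = Sequence[float]         # assumed: a flat feature vector per face


def cosine_similarity(a: Feature, b: Feature) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


@dataclass
class MultiFaceTracker:
    detect: Callable[[object], List[Box]]       # stage 1: stand-in for MTCNN
    extract: Callable[[object, Box], Feature]   # stage 2: stand-in for the simple CNN
    min_similarity: float = 0.5                 # hypothetical matching threshold
    tracks: Dict[int, Feature] = field(default_factory=dict)
    next_id: int = 0

    def update(self, frame) -> Dict[int, Box]:
        """Process one frame and return {track_id: box} for the faces seen in it."""
        boxes = self.detect(frame)
        features = [self.extract(frame, box) for box in boxes]

        # Stage 3 (assumed): score every detection against every known track,
        # then accept pairs greedily from the most similar downwards.
        pairs = sorted(
            ((cosine_similarity(feat, track_feat), i, track_id)
             for i, feat in enumerate(features)
             for track_id, track_feat in self.tracks.items()),
            key=lambda pair: pair[0],
            reverse=True,
        )
        assigned: Dict[int, int] = {}   # detection index -> track id
        used_tracks = set()
        for score, i, track_id in pairs:
            if score < self.min_similarity:
                break                   # remaining pairs are even less similar
            if i in assigned or track_id in used_tracks:
                continue
            assigned[i] = track_id
            used_tracks.add(track_id)

        result: Dict[int, Box] = {}
        for i, (box, feat) in enumerate(zip(boxes, features)):
            track_id = assigned.get(i)
            if track_id is None:        # unmatched detection starts a new track
                track_id = self.next_id
                self.next_id += 1
            self.tracks[track_id] = feat  # keep the most recent appearance
            result[track_id] = box
        return result
```

In this sketch, tracks that receive no detection in a frame are kept in `tracks`, so a face that is briefly occluded or missed by the detector can be matched to its old identifier when it reappears. No parameter is tuned per video and nothing is trained online, mirroring the constraints stated in the abstract.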