Yi Yang, F. Chen, Xiaoming Chen, Yan Dai, Zhenyang Chen, Jiang Ji, Tong Zhao
Video system for human attribute analysis using compact convolutional neural network
2016 IEEE International Conference on Image Processing (ICIP), pp. 584-588
Published: 2016-09-01
DOI: 10.1109/ICIP.2016.7532424 (https://doi.org/10.1109/ICIP.2016.7532424)
Citations: 9
Abstract
Convolutional neural networks (CNNs) perform well in human attribute analysis (e.g. age, gender and ethnicity). However, they run into problems such as limited robustness and responsiveness when deployed in an intelligent video system. We propose a compact CNN model, designed with both performance and usability in mind, and apply it in our video system. Using the proposed web image mining and labelling strategy, we construct a large training set that covers various image conditions. The proposed CNN model achieves a mean absolute error (MAE) of 3.23 years on the Morph 2 dataset under the same test policy as comparable work. To our knowledge, this is the state-of-the-art result for CNN-based age estimation. The proposed video analysis system employs this compact CNN model and demonstrates good performance both in dataset tests and in real-world deployments.
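The abstract reports age-estimation accuracy as a mean absolute error (MAE) in years. As a reminder of what that metric measures, here is a minimal sketch of the calculation. The function name and the sample ages are hypothetical and chosen only for illustration; they are not taken from the paper or from the Morph 2 dataset.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average of |true - predicted| over all samples, in years."""
    if len(true_ages) != len(predicted_ages):
        raise ValueError("both sequences must have the same length")
    if not true_ages:
        raise ValueError("at least one sample is required")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Hypothetical ground-truth ages and model predictions (illustrative only).
true_ages = [23, 35, 41, 58]
predicted_ages = [25.0, 31.5, 44.0, 57.5]

mae = mean_absolute_error(true_ages, predicted_ages)
print(f"MAE: {mae:.2f} years")  # prints "MAE: 2.25 years"
```

A lower MAE means the predicted ages sit closer to the true ages on average. Because the metric averages absolute differences, overestimates and underestimates do not cancel out.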