You Li, L. Zhuo, Jiafeng Li, Jing Zhang, Xi Liang, Q. Tian
{"title":"基于深度特征引导池的视频人物再识别","authors":"You Li, L. Zhuo, Jiafeng Li, Jing Zhang, Xi Liang, Q. Tian","doi":"10.1109/CVPRW.2017.188","DOIUrl":null,"url":null,"abstract":"Person re-identification (re-id) aims to match a specific person across non-overlapping views of different cameras, which is currently one of the hot topics in computer vision. Compared with image-based person re-id, video-based techniques could achieve better performance by fully utilizing the space-time information. This paper presents a novel video-based person re-id method named Deep Feature Guided Pooling (DFGP), which can take full advantage of the space-time information. The contributions of the method are in the following aspects: (1) PCA-based convolutional network (PCN), a lightweight deep learning network, is trained to generate deep features of video frames. Deep features are aggregated by average pooling to obtain person deep feature vectors. The vectors are utilized to guide the generation of human appearance features, which makes the appearance features robust to the severe noise in videos. (2) Hand-crafted local features of videos are aggregated by max pooling to reinforce the motion variations of different persons. In this way, the human descriptors are more discriminative. (3) The final human descriptors are composed of deep features and hand-crafted local features to take their own advantages and the performance of identification is promoted. 
Experimental results show that our approach outperforms six other state-of-the-art video-based methods on the challenging PRID 2011 and iLIDS-VID video-based person re-id datasets.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"115 1","pages":"1454-1461"},"PeriodicalIF":0.0000,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":"{\"title\":\"Video-Based Person Re-identification by Deep Feature Guided Pooling\",\"authors\":\"You Li, L. Zhuo, Jiafeng Li, Jing Zhang, Xi Liang, Q. Tian\",\"doi\":\"10.1109/CVPRW.2017.188\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Person re-identification (re-id) aims to match a specific person across non-overlapping views of different cameras, which is currently one of the hot topics in computer vision. Compared with image-based person re-id, video-based techniques could achieve better performance by fully utilizing the space-time information. This paper presents a novel video-based person re-id method named Deep Feature Guided Pooling (DFGP), which can take full advantage of the space-time information. The contributions of the method are in the following aspects: (1) PCA-based convolutional network (PCN), a lightweight deep learning network, is trained to generate deep features of video frames. Deep features are aggregated by average pooling to obtain person deep feature vectors. The vectors are utilized to guide the generation of human appearance features, which makes the appearance features robust to the severe noise in videos. (2) Hand-crafted local features of videos are aggregated by max pooling to reinforce the motion variations of different persons. In this way, the human descriptors are more discriminative. 
(3) The final human descriptors are composed of deep features and hand-crafted local features to take their own advantages and the performance of identification is promoted. Experimental results show that our approach outperforms six other state-of-the-art video-based methods on the challenging PRID 2011 and iLIDS-VID video-based person re-id datasets.\",\"PeriodicalId\":6668,\"journal\":{\"name\":\"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)\",\"volume\":\"115 1\",\"pages\":\"1454-1461\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-07-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"27\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CVPRW.2017.188\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPRW.2017.188","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Video-Based Person Re-identification by Deep Feature Guided Pooling
Person re-identification (re-id) aims to match a specific person across the non-overlapping views of different cameras, and is currently one of the hot topics in computer vision. Compared with image-based person re-id, video-based techniques can achieve better performance by fully exploiting spatio-temporal information. This paper presents a novel video-based person re-id method, Deep Feature Guided Pooling (DFGP), that takes full advantage of this spatio-temporal information. The contributions of the method are as follows: (1) A PCA-based convolutional network (PCN), a lightweight deep learning network, is trained to generate deep features for video frames. These deep features are aggregated by average pooling into person-level deep feature vectors, which are then used to guide the generation of human appearance features, making the appearance features robust to the severe noise in videos. (2) Hand-crafted local features of the videos are aggregated by max pooling to reinforce the motion variations of different persons, making the human descriptors more discriminative. (3) The final human descriptors combine the deep features and the hand-crafted local features, so that each contributes its own strengths and identification performance improves. Experimental results show that our approach outperforms six other state-of-the-art video-based methods on the challenging PRID 2011 and iLIDS-VID video-based person re-id datasets.
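The feature-aggregation scheme the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the feature dimensions and the random per-frame features stand in for the PCN deep features and the hand-crafted local features, whose actual extraction is not specified here. It shows only the pooling and concatenation steps: average pooling for the deep features, max pooling for the local features, and concatenation into the final descriptor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for per-frame features of one person's video
# sequence: deep features from a PCN-like network and hand-crafted
# local features. Dimensions are illustrative, not from the paper.
num_frames, deep_dim, local_dim = 20, 256, 128
deep_feats = rng.standard_normal((num_frames, deep_dim))
local_feats = rng.standard_normal((num_frames, local_dim))

# (1) Average pooling over frames aggregates the deep features into one
#     person-level deep feature vector, smoothing per-frame noise.
deep_vec = deep_feats.mean(axis=0)

# (2) Max pooling over frames keeps the strongest response in each
#     hand-crafted dimension, emphasizing motion variation.
local_vec = local_feats.max(axis=0)

# (3) The final human descriptor concatenates both parts so matching
#     can use deep and hand-crafted cues together.
descriptor = np.concatenate([deep_vec, local_vec])
print(descriptor.shape)  # (384,)
```

Matching between two camera views would then compare such descriptors, e.g. by Euclidean or learned metric distance; that step is outside this sketch.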