Pedestrian detection in low resolution videos
Hisham Sager, W. Hoff
IEEE Winter Conference on Applications of Computer Vision, pp. 668-673. Published 2014-03-24.
DOI: 10.1109/WACV.2014.6836038 (https://doi.org/10.1109/WACV.2014.6836038)
Pedestrian detection in low-resolution videos can be challenging. In outdoor surveillance scenarios, pedestrians often appear very small in the image (around 20 pixels tall). The most common and successful approaches to single-frame pedestrian detection use gradient-based features and a support vector machine classifier. We extend these ideas and develop a new algorithm that extracts gradient features from a spatiotemporal volume consisting of a short sequence of images (about one second in duration). The additional information provided by the person's motion compensates for the loss of resolution. On standard datasets (PETS2001, VIRAT), we show a significant improvement in performance over single-frame detection.
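To make the idea of gradient features over a spatiotemporal volume concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: the abstract does not specify the descriptor layout, so the function name `spatiotemporal_gradient_histogram`, the choice of two magnitude-weighted histograms (spatial orientation and temporal elevation), the bin count, and the normalisation are all illustrative assumptions. The resulting vector could then be passed to a classifier such as a support vector machine.

```python
import math


def spatiotemporal_gradient_histogram(volume, n_bins=8):
    """Illustrative descriptor for a small video volume.

    volume[t][y][x] holds grey-level intensities. Returns an L2-normalised
    list of length 2 * n_bins: a histogram of spatial gradient orientation
    followed by a histogram of temporal elevation angle. This layout is an
    assumption made for illustration, not the method from the paper.
    """
    n_frames, height, width = len(volume), len(volume[0]), len(volume[0][0])
    spatial = [0.0] * n_bins   # unsigned orientation of (gx, gy), in [0, pi)
    temporal = [0.0] * n_bins  # angle of gt against spatial magnitude, shifted to [0, pi]

    # Central differences, so only interior voxels are visited.
    for t in range(1, n_frames - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                gx = (volume[t][y][x + 1] - volume[t][y][x - 1]) / 2.0
                gy = (volume[t][y + 1][x] - volume[t][y - 1][x]) / 2.0
                gt = (volume[t + 1][y][x] - volume[t - 1][y][x]) / 2.0
                mag_xy = math.hypot(gx, gy)
                mag = math.sqrt(gx * gx + gy * gy + gt * gt)
                if mag == 0.0:
                    continue  # flat region, contributes nothing
                theta = math.atan2(gy, gx) % math.pi
                phi = math.atan2(gt, mag_xy) + math.pi / 2.0
                s_bin = min(int(theta / math.pi * n_bins), n_bins - 1)
                t_bin = min(int(phi / math.pi * n_bins), n_bins - 1)
                spatial[s_bin] += mag_xy
                temporal[t_bin] += mag

    hist = spatial + temporal
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def make_edge_volume(n_frames, height, width, speed):
    """Synthetic test volume: a vertical step edge moving `speed` pixels per frame."""
    return [[[1.0 if x >= 1 + speed * t else 0.0 for x in range(width)]
             for _ in range(height)]
            for t in range(n_frames)]


static = spatiotemporal_gradient_histogram(make_edge_volume(5, 6, 8, speed=0))
moving = spatiotemporal_gradient_histogram(make_edge_volume(5, 6, 8, speed=1))
```

In this toy setup the static and moving edges look identical in any single frame, and the difference shows up only in the temporal half of the descriptor. That is the kind of extra information the abstract credits with compensating for low resolution.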