Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition | Pub Date: 2018-06-01 | DOI: 10.1109/CVPR.2018.00107

Junhyug Noh, Soochan Lee, Beomsu Kim, Gunhee Kim

Pages: 966-974

Citations: 76

Abstract

We propose methods for addressing two critical issues in pedestrian detection: (i) occlusion of target objects, which causes false negatives, and (ii) confusion with hard negative examples such as vertical structures, which causes false positives. Our solutions to these two problems are general and flexible enough to be applicable to any single-stage detection model. We implement our methods in four state-of-the-art single-stage models: SqueezeDet+ [22], YOLOv2 [17], SSD [12], and DSSD [8]. We empirically validate that our approach improves the performance of those four models on the Caltech pedestrian [4] and CityPersons [25] datasets. Moreover, in some heavy occlusion settings, our approach achieves the best reported performance. Specifically, our two solutions are as follows. For better occlusion handling, we update the output tensors of single-stage models so that they include predictions of part confidence scores, from which we compute a final occlusion-aware detection score. To reduce confusion with hard negative examples, we introduce average grid classifiers as post-refinement classifiers, trainable in an end-to-end fashion with little memory and time overhead (e.g., an increase of 1-5 MB in memory and 1-2 ms in inference time).
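
The abstract does not give the formulas behind either solution, so the following is only a rough sketch of the two ideas under stated assumptions, not the authors' method. It assumes that the occlusion-aware score is a geometric mean of the box-level confidence and the strongest part confidence, and that the grid-based refinement averages per-cell classifier scores inside a box and blends that average equally with the detector confidence. All function names, the fusion rules, and the box convention are hypothetical.

```python
import math
from typing import Sequence, Tuple


def occlusion_aware_score(person_conf: float, part_confs: Sequence[float]) -> float:
    """Fuse a box-level confidence with per-part confidences.

    Assumption (not stated in the abstract): use the strongest part
    confidence and combine it with the box score by geometric mean.
    """
    if not part_confs:
        return person_conf  # no part predictions: fall back to the box score
    return math.sqrt(person_conf * max(part_confs))


def average_grid_score(grid: Sequence[Sequence[float]],
                       box: Tuple[int, int, int, int]) -> float:
    """Mean of per-cell scores inside box = (row0, col0, row1, col1), half-open."""
    r0, c0, r1, c1 = box
    cells = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    if not cells:
        raise ValueError("box covers no grid cells")
    return sum(cells) / len(cells)


def refine_with_grid(det_conf: float,
                     grid: Sequence[Sequence[float]],
                     box: Tuple[int, int, int, int]) -> float:
    """Assumed post-refinement: equal-weight blend of detector and grid scores."""
    return 0.5 * (det_conf + average_grid_score(grid, box))


if __name__ == "__main__":
    # A mostly occluded pedestrian: weak box score, one clearly visible part.
    print(occlusion_aware_score(0.25, [0.1, 0.2, 0.81]))
    # A confident detection over a region the grid classifier scores low.
    print(refine_with_grid(0.9, [[0.0, 0.2], [0.4, 0.6]], (0, 0, 2, 2)))
```

Under these assumptions, a visible part can lift the score of an occluded pedestrian above its box-level confidence, while a low average grid score pulls down a confident detection on a background structure.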
