Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Xinghao Chen, Chunjing Xu, Chang Xu, Yunhe Wang
{"title":"Positive-Unlabeled Data Purification in the Wild for Object Detection","authors":"Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Xinghao Chen, Chunjing Xu, Chang Xu, Yunhe Wang","doi":"10.1109/CVPR46437.2021.00268","DOIUrl":null,"url":null,"abstract":"Deep learning based object detection approaches have achieved great progress with the benefit from large amount of labeled images. However, image annotation remains a laborious, time-consuming and error-prone process. To further improve the performance of detectors, we seek to exploit all available labeled data and excavate useful samples from massive unlabeled images in the wild, which is rarely discussed before. In this paper, we present a positive-unlabeled learning based scheme to expand training data by purifying valuable images from massive unlabeled ones, where the original training data are viewed as positive data and the unlabeled images in the wild are unlabeled data. To effectively utilized these purified data, we propose a self-distillation algorithm based on hint learning and ground truth bounded knowledge distillation. Experimental results verify that the proposed positive-unlabeled data purification can strengthen the original detector by mining the massive unlabeled data. In particular, our method boosts the mAP of FPN by +2.0% on COCO benchmark.","PeriodicalId":339646,"journal":{"name":"2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR46437.2021.00268","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6
Abstract
Deep learning based object detection approaches have achieved great progress with the benefit from large amount of labeled images. However, image annotation remains a laborious, time-consuming and error-prone process. To further improve the performance of detectors, we seek to exploit all available labeled data and excavate useful samples from massive unlabeled images in the wild, which is rarely discussed before. In this paper, we present a positive-unlabeled learning based scheme to expand training data by purifying valuable images from massive unlabeled ones, where the original training data are viewed as positive data and the unlabeled images in the wild are unlabeled data. To effectively utilized these purified data, we propose a self-distillation algorithm based on hint learning and ground truth bounded knowledge distillation. Experimental results verify that the proposed positive-unlabeled data purification can strengthen the original detector by mining the massive unlabeled data. In particular, our method boosts the mAP of FPN by +2.0% on COCO benchmark.