{"title":"基于混合监督的少镜头标注目标检测实例学习","authors":"Yi Zhong, Chengyao Wang, Shiyong Li, Zhuyun Zhou, Yaowei Wang, Weishi Zheng","doi":"10.1145/3503161.3548242","DOIUrl":null,"url":null,"abstract":"Mixed supervision for object detection (MSOD) that utilizes image-level annotations and a small amount of instance-level annotations has emerged as an efficient tool by alleviating the requirement for a large amount of costly instance-level annotations and providing effective instance supervision on previous methods that only use image-level annotations. In this work, we introduce the mixed supervision instance learning (MSIL), as a novel MSOD framework to leverage a handful of instance-level annotations to provide more explicit and implicit supervision. Rather than just adding instance-level annotations directly on loss functions for detection, we aim to dig out more effective explicit and implicit relations between these two different level annotations. In particular, we firstly propose the Instance-Annotation Guided Image Classification strategy to provide explicit guidance from instance-level annotations by using positional relation to force the image classifier to focus on the proposals which contain the correct object. And then, in order to exploit more implicit interaction between the mixed annotations, an instance reproduction strategy guided by the extra instance-level annotations is developed for generating more accurate pseudo ground truth, achieving a more discriminative detector. Finally, a false target instance mining strategy is used to refine the above processing by enriching the number and diversity of training instances with the position and score information. Our experiments show that the proposed MSIL framework outperforms recent state-of-the-art mixed supervised detectors with a large margin on both the Pascal VOC2007 and the MS-COCO dataset.","PeriodicalId":412792,"journal":{"name":"Proceedings of the 30th ACM International Conference on Multimedia","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Mixed Supervision for Instance Learning in Object Detection with Few-shot Annotation\",\"authors\":\"Yi Zhong, Chengyao Wang, Shiyong Li, Zhuyun Zhou, Yaowei Wang, Weishi Zheng\",\"doi\":\"10.1145/3503161.3548242\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Mixed supervision for object detection (MSOD) that utilizes image-level annotations and a small amount of instance-level annotations has emerged as an efficient tool by alleviating the requirement for a large amount of costly instance-level annotations and providing effective instance supervision on previous methods that only use image-level annotations. In this work, we introduce the mixed supervision instance learning (MSIL), as a novel MSOD framework to leverage a handful of instance-level annotations to provide more explicit and implicit supervision. Rather than just adding instance-level annotations directly on loss functions for detection, we aim to dig out more effective explicit and implicit relations between these two different level annotations. In particular, we firstly propose the Instance-Annotation Guided Image Classification strategy to provide explicit guidance from instance-level annotations by using positional relation to force the image classifier to focus on the proposals which contain the correct object. And then, in order to exploit more implicit interaction between the mixed annotations, an instance reproduction strategy guided by the extra instance-level annotations is developed for generating more accurate pseudo ground truth, achieving a more discriminative detector. Finally, a false target instance mining strategy is used to refine the above processing by enriching the number and diversity of training instances with the position and score information. Our experiments show that the proposed MSIL framework outperforms recent state-of-the-art mixed supervised detectors with a large margin on both the Pascal VOC2007 and the MS-COCO dataset.\",\"PeriodicalId\":412792,\"journal\":{\"name\":\"Proceedings of the 30th ACM International Conference on Multimedia\",\"volume\":\"23 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 30th ACM International Conference on Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3503161.3548242\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM International Conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3503161.3548242","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Mixed Supervision for Instance Learning in Object Detection with Few-shot Annotation
Mixed supervision for object detection (MSOD) that utilizes image-level annotations and a small amount of instance-level annotations has emerged as an efficient tool by alleviating the requirement for a large amount of costly instance-level annotations and providing effective instance supervision on previous methods that only use image-level annotations. In this work, we introduce the mixed supervision instance learning (MSIL), as a novel MSOD framework to leverage a handful of instance-level annotations to provide more explicit and implicit supervision. Rather than just adding instance-level annotations directly on loss functions for detection, we aim to dig out more effective explicit and implicit relations between these two different level annotations. In particular, we firstly propose the Instance-Annotation Guided Image Classification strategy to provide explicit guidance from instance-level annotations by using positional relation to force the image classifier to focus on the proposals which contain the correct object. And then, in order to exploit more implicit interaction between the mixed annotations, an instance reproduction strategy guided by the extra instance-level annotations is developed for generating more accurate pseudo ground truth, achieving a more discriminative detector. Finally, a false target instance mining strategy is used to refine the above processing by enriching the number and diversity of training instances with the position and score information. Our experiments show that the proposed MSIL framework outperforms recent state-of-the-art mixed supervised detectors with a large margin on both the Pascal VOC2007 and the MS-COCO dataset.