A Novel Semi-Supervised Object Detection Approach via Scale Rebalancing and Global Proposal Contrast Consistency
Authors: Bo Liu; Chengrong Yang; Jing Guo; Yun Yang
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 1, pp. 232-244
Publication date: 2024-09-11
Publication type: Journal Article
DOI: 10.1109/TCSVT.2024.3458907
URL: https://ieeexplore.ieee.org/document/10676998/
Impact factor: 11.1; JCR: Q1 (Engineering, Electrical & Electronic)
Citations: 0
Abstract
Semi-supervised object detection (SSOD) uses a small amount of labeled data together with a large amount of unlabeled data to improve object detection performance. However, existing SSOD methods suffer from scale imbalance and class inconsistency, which lead to large differences in detection results across object scales and classes. To overcome these challenges, we propose a Scale-Rebalanced Global Proposal Contrast Consistency (SGPC) approach with three contributions: 1) a Scale-Rebalanced Input (SRI) structure, which adjusts the distribution of objects of different scales by resampling the input images at low magnification, thereby improving small object detection; 2) a Global Proposal Contrast Consistency (GPCC) loss, which enhances the intra-class compactness and inter-class diversity of Region of Interest (RoI) features, thereby reducing class inconsistency in pseudo-labels; and 3) a loss blending optimization strategy, which improves the localization accuracy of pseudo-labels by combining the supervised and unsupervised losses. Extensive experiments on multiple datasets show that SGPC significantly outperforms other recent methods on the SSOD task. On the PASCAL VOC dataset, SGPC achieves 55.90 mAP; on the MS-COCO dataset, it exceeds the supervised methods by more than 10 mAP at different scales; and on the small object detection datasets VisDrone-2019 and EDD, it also shows significant improvement and robustness.
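The abstract does not give the GPCC formulation, so the following is only a rough illustration of what "intra-class compactness and inter-class diversity" of RoI features means in practice. It sketches a generic supervised contrastive loss over feature vectors with (pseudo-)class labels; this is a stand-in technique, not the paper's loss. The function names, the toy two-dimensional features, and the temperature value are all assumptions made for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, NOT the paper's GPCC).

    features: list of feature vectors, e.g. pooled RoI features (assumed shape).
    labels:   one class id per feature, e.g. pseudo-labels.
    Returns the mean loss over anchors that have at least one same-class
    positive; returns 0.0 if no anchor has a positive.
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other feature.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor features.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability assigned to the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy features: two tight clusters in 2-D.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
consistent = supervised_contrastive_loss(toy_features, [0, 0, 1, 1])
inconsistent = supervised_contrastive_loss(toy_features, [0, 1, 0, 1])
print(consistent < inconsistent)
```

In this toy setup the loss is lower when the labels agree with the feature clusters than when they cut across them, which is the sense in which minimizing such a loss pulls same-class features together and pushes different-class features apart.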
About the journal:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.