Libra-SOD: Balanced label assignment for small object detection
Authors: Zhuangzhuang Zhou, Yingying Zhu
Journal: Knowledge-Based Systems (Impact Factor 7.2; JCR Q1, Computer Science, Artificial Intelligence)
DOI: 10.1016/j.knosys.2024.112353
Published: 2024-08-08
URL: https://www.sciencedirect.com/science/article/pii/S0950705124009870
Citations: 0
Abstract
Small object detection (SOD) is a notoriously challenging task in computer vision. Because small instances occupy small regions and overlap little with priors (anchors or points), strict label assignment based on predefined IoU thresholds usually leaves small objects with too few training samples. Although center sampling and IoU statistic-based label assignment strategies mitigate this imbalance, they struggle to deliver consistent gains for small, medium, and large objects simultaneously. In this paper, we propose Libra-SOD, a novel model with a balanced label assignment (BLA) strategy for SOD in complex scenes. First, we propose BLA, which considers both classification confidence and localization quality during assignment and assigns the same number of positive samples to each ground truth. Second, to work closely with BLA, we introduce a task-aware head, which makes the assignment results more reliable by interweaving the classification and regression tasks. Finally, a task-aware loss dynamically assigns weight factors and labels when supervising predictions, allowing the framework to focus more on valuable samples. Extensive experiments are performed on four challenging datasets. On DIOR (object DetectIon in Optical Remote sensing images), Libra-SOD achieves a state-of-the-art 73.7 mAP with ResNet-50 as the backbone. To the best of our knowledge, Libra-SOD is the first single-stage framework to exceed 30 AP on SODA-D (Small Object Detection dAtasets).
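The abstract gives no formula for BLA, so the following is only a minimal sketch of the general idea it describes: score each (ground truth, prior) pair by combining classification confidence with localization quality, then give every ground truth the same number of positives. The function name `balanced_assign`, the product-form quality score with exponents `alpha` and `beta`, the budget `k`, and the greedy conflict resolution are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List


def balanced_assign(
    cls_scores: List[List[float]],
    ious: List[List[float]],
    k: int = 2,
    alpha: float = 1.0,
    beta: float = 1.0,
) -> Dict[int, List[int]]:
    """Assign up to k positive priors to every ground truth (GT).

    cls_scores[g][p]: predicted confidence of prior p for the class of GT g.
    ious[g][p]:       IoU between the box predicted at prior p and GT g.
    The quality score below is an assumed form, not taken from the paper.
    """
    num_gts = len(ious)
    num_priors = len(ious[0]) if num_gts else 0

    # Score every (GT, prior) pair with a joint classification/localization metric.
    candidates = []
    for g in range(num_gts):
        for p in range(num_priors):
            quality = (cls_scores[g][p] ** alpha) * (ious[g][p] ** beta)
            candidates.append((quality, g, p))

    # Visit pairs from best to worst; a prior may serve only one GT,
    # and a GT stops accepting priors once it holds k of them.
    candidates.sort(key=lambda c: c[0], reverse=True)
    assigned: Dict[int, List[int]] = {g: [] for g in range(num_gts)}
    taken = set()
    for _, g, p in candidates:
        if len(assigned[g]) < k and p not in taken:
            assigned[g].append(p)
            taken.add(p)
    return assigned


if __name__ == "__main__":
    # GT 0 stands in for a small object with low IoUs; GT 1 for a larger one.
    scores = [[0.9, 0.8, 0.2, 0.1, 0.1, 0.1],
              [0.1, 0.7, 0.9, 0.6, 0.2, 0.1]]
    overlaps = [[0.5, 0.4, 0.1, 0.05, 0.0, 0.0],
                [0.0, 0.5, 0.6, 0.3, 0.1, 0.0]]
    print(balanced_assign(scores, overlaps, k=2))
```

Under this sketch, a fixed IoU threshold of 0.5 would keep only one prior for the first ground truth, whereas the per-ground-truth budget gives both objects two positives. When the number of priors is at least `k` times the number of ground truths, the greedy pass fills every budget exactly.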
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.