Dynamic Feature Focusing Network for small object detection
Rudong Jing, Wei Zhang, Yuzhuo Li, Wenlin Li, Yanyan Liu
Information Processing & Management, Volume 61, Issue 6, Article 103858. Published 2024-08-12. Journal Article.
DOI: 10.1016/j.ipm.2024.103858
https://www.sciencedirect.com/science/article/pii/S0306457324002176
Impact factor: 7.4 (JCR Q1, Computer Science, Information Systems)
Citations: 0
Abstract
Deep learning has driven research in object detection and achieved impressive results. Despite these advances, small object detection still suffers from low recognition rates and inaccurate localization, primarily because the objects are so small. The location deviation of small objects induces severe feature misalignment, and the imbalance between the classification and regression tasks hinders accurate recognition. To address these issues, we propose a Dynamic Feature Focusing Network (DFFN), which contains two key modules: the Visual Perception Enhancement Module (VPEM) and the Task Association Module (TAM). Drawing on deformable convolution and the attention mechanism, the VPEM concentrates on sparse key features and perceives misalignment via positional offsets. We aggregate multi-level features at identical spatial locations via a layer average operation to learn a more discriminative representation. Comprising class alignment and bounding box alignment parts, the TAM improves classification ability, refines bounding box regression, and facilitates the joint learning of classification and localization. We conduct a range of experiments, and the proposed method considerably improves small object detection performance on four benchmark datasets: MS COCO, VisDrone, VOC, and TinyPerson. On COCO, our method improves mAP and AP_s by 3.4 and 2.2, respectively. Compared with other classic detection models, DFFN is highly competitive in precision.
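The layer average operation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how feature levels of different resolutions are brought to a common size, so the nearest-neighbor resizing below is an assumption, and the function names `upsample_nearest` and `layer_average` are hypothetical. The sketch uses single-channel maps as plain Python lists and is not the authors' implementation.

```python
def upsample_nearest(fmap, out_h, out_w):
    """Resize a 2D feature map (list of rows) to out_h x out_w.

    Nearest-neighbor resizing is an assumption made for this sketch.
    """
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def layer_average(levels):
    """Average several pyramid levels position by position.

    Every level is first resized to the resolution of the first (finest)
    level, so each output cell averages values from the same location.
    """
    out_h, out_w = len(levels[0]), len(levels[0][0])
    resized = [upsample_nearest(level, out_h, out_w) for level in levels]
    return [
        [sum(r[y][x] for r in resized) / len(resized) for x in range(out_w)]
        for y in range(out_h)
    ]


# Toy single-channel pyramid: 4x4, 2x2, and 1x1 maps (values are made up).
p3 = [[1.0] * 4 for _ in range(4)]
p4 = [[2.0, 4.0], [6.0, 8.0]]
p5 = [[9.0]]
fused = layer_average([p3, p4, p5])
```

Here `fused` keeps the 4x4 resolution of the finest level, and each cell is the mean of the three levels at that location. For example, the top-left cell is (1.0 + 2.0 + 9.0) / 3 = 4.0.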
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.