Localization Distillation for Object Detection

Zhaohui Zheng, Rongguang Ye, Qibin Hou, Dongwei Ren, Ping Wang, Wangmeng Zuo, Ming-Ming Cheng
{"title":"Localization Distillation for Object Detection","authors":"Zhaohui Zheng;Rongguang Ye;Qibin Hou;Dongwei Ren;Ping Wang;Wangmeng Zuo;Ming-Ming Cheng","doi":"10.1109/TPAMI.2023.3248583","DOIUrl":null,"url":null,"abstract":"Previous knowledge distillation (KD) methods for object detection mostly focus on feature imitation instead of mimicking the prediction logits due to its inefficiency in distilling the localization information. In this paper, we investigate whether logit mimicking always lags behind feature imitation. Towards this goal, we first present a novel localization distillation (LD) method which can efficiently transfer the localization knowledge from the teacher to the student. Second, we introduce the concept of valuable localization region that can aid to selectively distill the classification and localization knowledge for a certain region. Combining these two new components, for the first time, we show that logit mimicking can outperform feature imitation and the absence of localization distillation is a critical reason for why logit mimicking under-performs for years. The thorough studies exhibit the great potential of logit mimicking that can significantly alleviate the localization ambiguity, learn robust feature representation, and ease the training difficulty in the early stage. We also provide the theoretical connection between the proposed LD and the classification KD, that they share the equivalent optimization effect. Our distillation scheme is simple as well as effective and can be easily applied to both dense horizontal object detectors and rotated object detectors. Extensive experiments on the MS COCO, PASCAL VOC, and DOTA benchmarks demonstrate that our method can achieve considerable AP improvement without any sacrifice on the inference speed. Our source code and pretrained models are publicly available at \n<uri>https://github.com/HikariTJU/LD</uri>\n.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"45 8","pages":"10070-10083"},"PeriodicalIF":0.0000,"publicationDate":"2023-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10052761/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 27

Abstract

Previous knowledge distillation (KD) methods for object detection mostly focus on feature imitation rather than mimicking the prediction logits, since logit mimicking is considered inefficient at distilling localization information. In this paper, we investigate whether logit mimicking always lags behind feature imitation. To this end, we first present a novel localization distillation (LD) method that can efficiently transfer localization knowledge from the teacher to the student. Second, we introduce the concept of the valuable localization region, which helps to selectively distill the classification and localization knowledge for a given region. Combining these two components, we show for the first time that logit mimicking can outperform feature imitation, and that the absence of localization distillation is a key reason why logit mimicking has underperformed for years. Thorough studies demonstrate the great potential of logit mimicking: it significantly alleviates localization ambiguity, learns robust feature representations, and eases training difficulty in the early stage. We also establish a theoretical connection between the proposed LD and classification KD: they share an equivalent optimization effect. Our distillation scheme is simple yet effective and can be easily applied to both dense horizontal object detectors and rotated object detectors. Extensive experiments on the MS COCO, PASCAL VOC, and DOTA benchmarks demonstrate that our method achieves considerable AP improvement without sacrificing inference speed. Our source code and pretrained models are publicly available at https://github.com/HikariTJU/LD.
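The core idea, as the abstract describes it, is to treat bounding-box regression logits the way classification KD treats class logits. As a rough illustration only (not the authors' reference implementation; see the linked repository for that), the sketch below assumes a GFocal-style detection head in which each box edge is predicted as a categorical distribution over discrete bins, and applies a temperature-scaled KL divergence between the teacher's and student's edge distributions. The function name, tensor shapes, and default temperature here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def localization_distillation_loss(student_logits: torch.Tensor,
                                   teacher_logits: torch.Tensor,
                                   temperature: float = 10.0) -> torch.Tensor:
    """Sketch of an LD-style loss on discretized box-edge distributions.

    Assumes a GFocal-style head: student_logits and teacher_logits have
    shape (num_boxes, 4, n_bins), i.e. one categorical distribution over
    n_bins discrete offsets per box edge. Shapes and the temperature
    default are illustrative, not taken from the paper's code.
    """
    T = temperature
    # Soften both distributions with the same temperature, as in
    # classification KD.
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    # KL(teacher || student) per box edge; summing over the bin
    # dimension gives one divergence value per (box, edge) pair.
    kl = F.kl_div(log_p_student, p_teacher, reduction="none").sum(dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across
    # temperatures, following standard KD practice.
    return (T * T) * kl.mean()
```

In this form, LD is structurally identical to classification KD applied per box edge, which is consistent with the abstract's claim that the two share an equivalent optimization effect; the paper's valuable localization region would additionally restrict which boxes contribute to this loss.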