Foreground separation knowledge distillation for object detection.

IF 3.5 · CAS Region 4 (Computer Science) · JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) · PeerJ Computer Science, Volume 10, e2485 · Pub Date: 2024-11-13 · eCollection Date: 2024-01-01 · DOI: 10.7717/peerj-cs.2485 · Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623026/pdf/
Chao Li, Rugui Liu, Zhe Quan, Pengpeng Hu, Jun Sun
{"title":"前景分离知识精馏用于目标检测。","authors":"Chao Li, Rugui Liu, Zhe Quan, Pengpeng Hu, Jun Sun","doi":"10.7717/peerj-cs.2485","DOIUrl":null,"url":null,"abstract":"<p><p>In recent years, deep learning models have become predominant methods for computer vision tasks, but the large computation and storage requirements of many models make them challenging to deploy on devices with limited resources. Knowledge distillation (KD) is a widely used approach for model compression. However, when applied in the object detection problems, the existing KD methods either directly applies the feature map or simply separate the foreground from the background by using a binary mask, aligning the attention between the teacher and the student models. Unfortunately, these methods either completely overlook or fail to thoroughly eliminate noise, resulting in unsatisfactory model accuracy for student models. To address this issue, we propose a foreground separation distillation (FSD) method in this paper. The FSD method enables student models to distinguish between foreground and background using Gaussian heatmaps, reducing irrelevant information in the learning process. Additionally, FSD also extracts the channel feature by converting the spatial feature maps into probabilistic forms to fully utilize the knowledge in each channel of a well-trained teacher. Experimental results demonstrate that the YOLOX detector enhanced with our distillation method achieved superior performance on both the fall detection and the VOC2007 datasets. For example, YOLOX with FSD achieved 73.1% mean average precision (mAP) on the Fall Detection dataset, which is 1.6% higher than the baseline. The code of FSD is accessible via https://doi.org/10.5281/zenodo.13829676.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2485"},"PeriodicalIF":3.5000,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623026/pdf/","citationCount":"0","resultStr":"{\"title\":\"Foreground separation knowledge distillation for object detection.\",\"authors\":\"Chao Li, Rugui Liu, Zhe Quan, Pengpeng Hu, Jun Sun\",\"doi\":\"10.7717/peerj-cs.2485\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>In recent years, deep learning models have become predominant methods for computer vision tasks, but the large computation and storage requirements of many models make them challenging to deploy on devices with limited resources. Knowledge distillation (KD) is a widely used approach for model compression. However, when applied in the object detection problems, the existing KD methods either directly applies the feature map or simply separate the foreground from the background by using a binary mask, aligning the attention between the teacher and the student models. Unfortunately, these methods either completely overlook or fail to thoroughly eliminate noise, resulting in unsatisfactory model accuracy for student models. To address this issue, we propose a foreground separation distillation (FSD) method in this paper. The FSD method enables student models to distinguish between foreground and background using Gaussian heatmaps, reducing irrelevant information in the learning process. Additionally, FSD also extracts the channel feature by converting the spatial feature maps into probabilistic forms to fully utilize the knowledge in each channel of a well-trained teacher. 
Experimental results demonstrate that the YOLOX detector enhanced with our distillation method achieved superior performance on both the fall detection and the VOC2007 datasets. For example, YOLOX with FSD achieved 73.1% mean average precision (mAP) on the Fall Detection dataset, which is 1.6% higher than the baseline. The code of FSD is accessible via https://doi.org/10.5281/zenodo.13829676.</p>\",\"PeriodicalId\":54224,\"journal\":{\"name\":\"PeerJ Computer Science\",\"volume\":\"10 \",\"pages\":\"e2485\"},\"PeriodicalIF\":3.5000,\"publicationDate\":\"2024-11-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623026/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"PeerJ Computer Science\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.7717/peerj-cs.2485\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2485","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract


In recent years, deep learning models have become the predominant methods for computer vision tasks, but the large computation and storage requirements of many models make them challenging to deploy on devices with limited resources. Knowledge distillation (KD) is a widely used approach for model compression. However, when applied to object detection, existing KD methods either use the feature map directly or separate the foreground from the background with a simple binary mask when aligning attention between the teacher and student models. These methods therefore either overlook noise entirely or fail to eliminate it thoroughly, resulting in unsatisfactory accuracy for the student models. To address this issue, we propose a foreground separation distillation (FSD) method. FSD enables student models to distinguish between foreground and background using Gaussian heatmaps, reducing irrelevant information in the learning process. FSD also extracts channel features by converting the spatial feature maps into probabilistic form, so as to fully utilize the knowledge in each channel of a well-trained teacher. Experimental results demonstrate that a YOLOX detector enhanced with our distillation method achieves superior performance on both the Fall Detection and VOC2007 datasets. For example, YOLOX with FSD achieved 73.1% mean average precision (mAP) on the Fall Detection dataset, 1.6% higher than the baseline. The code of FSD is accessible via https://doi.org/10.5281/zenodo.13829676.
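The abstract describes two mechanisms: weighting the feature-imitation loss with Gaussian heatmaps centered on ground-truth objects (a soft alternative to a binary foreground mask), and a channel-wise loss that turns each channel's spatial map into a probability distribution before matching teacher and student. The following Python (PyTorch) sketch illustrates both ideas under stated assumptions: the function names, the CenterNet-style heatmap rendering, and the choice of Gaussian radius are our own illustrative choices, not the authors' formulation; their actual implementation is available at the Zenodo link above.

# Minimal PyTorch sketch of the two ideas described in the abstract.
# NOT the authors' implementation; heatmap construction and loss
# definitions here are illustrative assumptions.
import torch
import torch.nn.functional as F

def gaussian_heatmap(boxes, h, w, device="cpu"):
    """Render a soft foreground heatmap from ground-truth boxes.

    boxes: (N, 4) tensor of [x1, y1, x2, y2] in feature-map coordinates.
    Returns an (h, w) map in [0, 1]: ~1 near object centers, ~0 in background.
    """
    ys = torch.arange(h, device=device).view(h, 1).float()
    xs = torch.arange(w, device=device).view(1, w).float()
    heat = torch.zeros(h, w, device=device)
    for x1, y1, x2, y2 in boxes:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        # Illustrative choice: tie the Gaussian radius to the box size.
        sx = torch.clamp((x2 - x1) / 6, min=1.0)
        sy = torch.clamp((y2 - y1) / 6, min=1.0)
        g = torch.exp(-((xs - cx) ** 2 / (2 * sx ** 2)
                        + (ys - cy) ** 2 / (2 * sy ** 2)))
        heat = torch.maximum(heat, g)  # overlapping objects keep the stronger response
    return heat

def foreground_imitation_loss(f_s, f_t, heat):
    """Heatmap-weighted feature imitation: background positions are
    down-weighted smoothly instead of cut off by a hard binary mask.

    f_s, f_t: (B, C, H, W) student / teacher feature maps (same shape).
    heat:     (H, W) Gaussian foreground heatmap.
    """
    w = heat.view(1, 1, *heat.shape)
    return (w * (f_s - f_t) ** 2).sum() / w.sum().clamp(min=1e-6)

def channel_distribution_loss(f_s, f_t, tau=1.0):
    """Channel-wise distillation: each channel's spatial map becomes a
    probability distribution over positions, matched with KL divergence."""
    b, c, h, w = f_s.shape
    p_t = F.softmax(f_t.view(b, c, h * w) / tau, dim=-1)
    log_p_s = F.log_softmax(f_s.view(b, c, h * w) / tau, dim=-1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * tau ** 2

In a training loop, these terms would be added to the detector's own loss, e.g. total = det_loss + alpha * foreground_imitation_loss(f_s, f_t, heat) + beta * channel_distribution_loss(f_s, f_t), with the weights alpha and beta tuned per dataset; the paper's actual weighting scheme may differ.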

Source journal
PeerJ Computer Science (Computer Science - General Computer Science)
CiteScore: 6.10
Self-citation rate: 5.30%
Articles published: 332
Review time: 10 weeks
Journal description: PeerJ Computer Science is an open access journal covering all subject areas in computer science, backed by a prestigious advisory board and more than 300 academic editors.
Latest articles in this journal
Design of a 3D emotion mapping model for visual feature analysis using improved Gaussian mixture models.
Enhancing task execution: a dual-layer approach with multi-queue adaptive priority scheduling.
LOGIC: LLM-originated guidance for internal cognitive improvement of small language models in stance detection.
Generative AI and future education: a review, theoretical validation, and authors' perspective on challenges and solutions.
MSR-UNet: enhancing multi-scale and long-range dependencies in medical image segmentation.