Improve Fine-grained Visual Classification Accuracy by Controllable Location Knowledge Distillation

You-Lin Tsai, Cheng-Hung Lin, Po-Yung Chou
DOI: 10.1109/ICCE59016.2024.10444242
Published in: 2024 IEEE International Conference on Consumer Electronics (ICCE), pp. 1-5
Publication date: 2024-01-06
Citations: 0

Abstract

Current state-of-the-art network models have achieved remarkable performance. However, they often suffer from excessively large architectures, which makes them challenging to deploy on edge devices. In response to this challenge, knowledge distillation has been introduced: information is transferred from a complex teacher model to a simpler student model, effectively reducing model complexity. Prior approaches have demonstrated promising transfer effects with this technique. Nevertheless, in the field of fine-grained image classification, there has been limited exploration of distillation methods tailored specifically to this domain. In this paper, we focus on knowledge distillation designed for fine-grained image recognition. Our strategy is inspired by Class Activation Maps (CAM). We first train a complex teacher model and use it to generate feature maps with spatial information, which we call hint maps. We then propose an adjustment strategy for these hint maps that controls the local distribution of information, which we call Controllable Size CAM (CTRLS-CAM). Using the adjusted hint map as a guide allows the student model to learn effectively from the teacher model's behavior and to concentrate on discriminative details. In contrast to conventional distillation models, this strategy proves particularly advantageous for fine-grained recognition, enhancing learning outcomes and enabling the student model to achieve superior performance. In the CTRLS-CAM method, we refine the hint maps' value distribution, redefining the relative relationship between primary and secondary feature areas. We conducted experiments on the CUB200-2011 dataset, and our results demonstrate a significant accuracy improvement of about 5% over the original non-distilled student model, as well as a 1.23% improvement over traditional knowledge distillation methods.
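The abstract describes the pipeline at a high level: compute a CAM-style hint map from the teacher's feature maps, adjust its value distribution to control how much weight primary versus secondary regions receive, and train the student to match the adjusted map. The paper's exact CTRLS-CAM adjustment is not specified in the abstract, so the sketch below is purely illustrative: the function names, the power-transform adjustment, and the MSE hint loss are assumptions standing in for the authors' method, shown only to make the data flow concrete.

```python
import numpy as np

def cam(features, weights, cls):
    # features: (C, H, W) feature maps; weights: (num_classes, C) classifier weights.
    # Class Activation Map: class-weighted sum over channels, normalized to [0, 1].
    m = np.tensordot(weights[cls], features, axes=([0], [0]))  # -> (H, W)
    m -= m.min()
    return m / (m.max() + 1e-8)

def adjust_hint(hint, gamma=0.5):
    # Illustrative "controllable size" step (NOT the paper's exact formula):
    # a power transform re-weights primary (high) vs. secondary (low) activations;
    # gamma < 1 enlarges the emphasized region, gamma > 1 shrinks it.
    return hint ** gamma

def hint_loss(student_hint, teacher_hint):
    # Distillation objective: pull the student's hint map toward the
    # adjusted teacher hint map (MSE used here for simplicity).
    return float(np.mean((student_hint - teacher_hint) ** 2))

# Toy tensors in place of real teacher/student backbones.
rng = np.random.default_rng(0)
t_feat = rng.random((8, 7, 7))   # teacher feature maps
s_feat = rng.random((8, 7, 7))   # student feature maps
w = rng.random((10, 8))          # classifier weights (shared here for brevity)

t_hint = adjust_hint(cam(t_feat, w, cls=3))
s_hint = cam(s_feat, w, cls=3)
loss = hint_loss(s_hint, t_hint)
```

In a real training loop this hint loss would be added to the usual classification (and possibly logit-distillation) losses, with the feature maps taken from intermediate layers of the two networks.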