A Unified Asymmetric Knowledge Distillation Framework for Image Classification

IF 2.6 4区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Neural Processing Letters Pub Date : 2024-05-10 DOI:10.1007/s11063-024-11606-z

Xin Ye, Xiang Tian, Bolun Zheng, Fan Zhou, Yaowu Chen

{"title":"A Unified Asymmetric Knowledge Distillation Framework for Image Classification","authors":"Xin Ye, Xiang Tian, Bolun Zheng, Fan Zhou, Yaowu Chen","doi":"10.1007/s11063-024-11606-z","DOIUrl":null,"url":null,"abstract":"<p>Knowledge distillation is a model compression technique that transfers knowledge learned by teacher networks to student networks. Existing knowledge distillation methods greatly expand the forms of knowledge, but also make the distillation models complex and symmetric. However, few studies have explored the commonalities among these methods. In this study, we propose a concise distillation framework to unify these methods and a method to construct asymmetric knowledge distillation under the framework. Asymmetric distillation aims to enable differentiated knowledge transfers for different distillation objects. We designed a multi-stage shallow-wide branch bifurcation method to distill different knowledge representations and a grouping ensemble strategy to supervise the network to teach and learn selectively. Consequently, we conducted experiments using image classification benchmarks to verify the proposed method. Experimental results show that our implementation can achieve considerable improvements over existing methods, demonstrating the effectiveness of the method and the potential of the framework.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"21 1","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Processing Letters","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11063-024-11606-z","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Knowledge distillation is a model compression technique that transfers knowledge learned by teacher networks to student networks. Existing knowledge distillation methods greatly expand the forms of knowledge, but also make the distillation models complex and symmetric. However, few studies have explored the commonalities among these methods. In this study, we propose a concise distillation framework to unify these methods and a method to construct asymmetric knowledge distillation under the framework. Asymmetric distillation aims to enable differentiated knowledge transfers for different distillation objects. We designed a multi-stage shallow-wide branch bifurcation method to distill different knowledge representations and a grouping ensemble strategy to supervise the network to teach and learn selectively. Consequently, we conducted experiments using image classification benchmarks to verify the proposed method. Experimental results show that our implementation can achieve considerable improvements over existing methods, demonstrating the effectiveness of the method and the potential of the framework.

Abstract Image

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

图像分类的统一非对称知识提炼框架

知识蒸馏是一种将教师网络所学知识转移到学生网络的模型压缩技术。现有的知识蒸馏方法大大扩展了知识的形式，但也使蒸馏模型变得复杂和对称。然而，很少有研究探讨这些方法之间的共性。在本研究中，我们提出了一个简明的蒸馏框架来统一这些方法，并在该框架下提出了一种构建非对称知识蒸馏的方法。非对称蒸馏旨在针对不同的蒸馏对象实现差异化的知识转移。我们设计了一种多级浅宽分支分叉法来提炼不同的知识表征，并设计了一种分组集合策略来监督网络有选择地教学和学习。因此，我们使用图像分类基准进行了实验，以验证所提出的方法。实验结果表明，与现有方法相比，我们的实现方法可以取得相当大的改进，证明了该方法的有效性和该框架的潜力。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Neural Processing Letters 工程技术-计算机：人工智能

CiteScore

4.90

自引率

12.90%

发文量

392

审稿时长

2.8 months

期刊介绍： Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective researches. The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters