{"title":"Review of Recent Distillation Studies","authors":"Min Gao","doi":"10.1051/matecconf/202338201034","DOIUrl":null,"url":null,"abstract":"Knowledge distillation has gained a lot of interest in recent years because it allows for compressing a large deep neural network (teacher DNN) into a smaller DNN (student DNN), while maintaining its accuracy. Recent improvements have been made to knowledge distillation. One such improvement is the teaching assistant distillation method. This method involves introducing an intermediate \"teaching assistant\" model between the teacher and student. The teaching assistant is first trained to mimic the teacher, and then the student is trained to mimic the teaching assistant. This multi-step process can improve student performance. Another improvement to knowledge distillation is curriculum distillation. This method involves gradually training the student by exposing it to increasingly difficult concepts over time, similar to curriculum learning in humans. This process can help the student learn in a more stable and consistent manner. Finally, there is the mask distillation method. Here, the student is trained to specifically mimic the attention mechanisms learned by the teacher, not just the overall output of the teacher DNN. These improvements help to enhance the knowledge distillation process and enable the creation of more efficient DNNs.","PeriodicalId":18309,"journal":{"name":"MATEC Web of Conferences","volume":"137 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"MATEC Web of Conferences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1051/matecconf/202338201034","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Knowledge distillation has attracted considerable interest in recent years because it compresses a large deep neural network (the teacher DNN) into a smaller one (the student DNN) while largely preserving its accuracy. Several recent improvements to knowledge distillation have been proposed. One is teaching assistant distillation, which introduces an intermediate "teaching assistant" model between the teacher and the student: the teaching assistant is first trained to mimic the teacher, and the student is then trained to mimic the teaching assistant. This multi-step process can improve student performance, particularly when the capacity gap between teacher and student is large. Another improvement is curriculum distillation, in which the student is trained gradually on increasingly difficult concepts over time, much like curriculum learning in humans; this helps the student learn in a more stable and consistent manner. Finally, mask distillation trains the student to mimic the attention mechanisms learned by the teacher, rather than only the teacher DNN's overall output. Together, these improvements enhance the knowledge distillation process and enable the creation of more efficient DNNs.
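To make the ideas above concrete, the sketch below shows a minimal, hedged illustration (not the paper's code) of the standard knowledge distillation objective and how teaching assistant distillation chains two distillation stages. The model names (big_net, mid_net, small_net), the data loader, and all hyperparameters are hypothetical placeholders chosen for the example.

```python
# Minimal sketch of knowledge distillation (KD) and the teaching-assistant
# variant. Assumes PyTorch; all models and the data loader are placeholders.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Standard KD objective: soft teacher targets plus hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # scale by T^2 to keep gradient magnitudes comparable
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard


def distill(teacher, student, loader, epochs=10, lr=1e-3):
    """Train `student` to mimic a frozen `teacher` on the data in `loader`."""
    teacher.eval()
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            with torch.no_grad():
                t_logits = teacher(x)
            loss = distillation_loss(student(x), t_logits, y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student


# Teaching-assistant distillation: distil teacher -> assistant, then
# assistant -> student, instead of teacher -> student directly.
# `big_net`, `mid_net`, `small_net`, and `train_loader` are assumed to exist.
# assistant = distill(big_net, mid_net, train_loader)
# student = distill(assistant, small_net, train_loader)
```

Curriculum distillation would reorder the batches in `loader` from easy to hard, and mask distillation would add a term matching the student's attention maps to the teacher's; both fit the same training loop.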
Journal description:
MATEC Web of Conferences is an Open Access publication series dedicated to archiving conference proceedings covering all fundamental and applied research related to materials science, engineering, and chemistry. All engineering disciplines fall within the aims and scope of the journal: civil, naval, mechanical, chemical, and electrical engineering, as well as nanotechnology and metrology. The journal also covers all materials with regard to their physical-chemical characterization, implementation, and resistance in their environment. Other subdisciplines of chemistry (analytical chemistry, petrochemistry, organic chemistry, and so on), and even pharmacology, are also welcome. MATEC Web of Conferences offers a wide range of services, from organizing the submission of conference proceedings to the worldwide dissemination of the conference papers. It provides an efficient archiving solution, ensuring maximum exposure and wide indexing of scientific conference proceedings. Proceedings are published under the scientific responsibility of the conference editors.