Learning to Weight Samples for Dynamic Early-exiting Networks

Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision. Pub Date: 2022-09-17. DOI: 10.48550/arXiv.2209.08310

Yizeng Han, Yifan Pu, Zihang Lai, Chaofei Wang, S. Song, Junfen Cao, Wenhui Huang, Chao Deng, Gao Huang

Volume: 49 1. Pages: 362-378. Publication type: Journal Article.

Citations: 19

Abstract

Early exiting is an effective paradigm for improving the inference efficiency of deep networks. By constructing classifiers with varying resource demands (the exits), such networks allow easy samples to be output at early exits, removing the need to execute deeper layers. While existing works mainly focus on the architectural design of multi-exit networks, the training strategies for such models remain largely unexplored. The current state-of-the-art models treat all samples the same during training and ignore the early-exiting behavior during testing, which leads to a gap between training and testing. In this paper, we propose to bridge this gap by sample weighting. Intuitively, easy samples, which generally exit early in the network during inference, should contribute more to training the early classifiers. Hard samples, which mostly exit from deeper layers, should instead be emphasized when training the late classifiers. We propose to adopt a weight prediction network to weight the loss of different training samples at each exit. This weight prediction network and the backbone model are jointly optimized under a meta-learning framework with a novel optimization objective. By bringing the adaptive behavior during inference into the training phase, we show that the proposed weighting mechanism consistently improves the trade-off between classification accuracy and inference efficiency. Code is available at https://github.com/LeapLabTHU/L2W-DEN.
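The two ideas in the abstract, exiting at an early classifier during inference and weighting each sample's loss at each exit during training, can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the linked repository for that). The confidence-threshold exit rule, the function names, the toy logits, and the hand-set weights are all assumptions made for illustration. In the paper the weights come from a learned weight prediction network trained with meta-learning, which is not reproduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy loss of one sample at one exit."""
    return -math.log(softmax(logits)[label])


def exit_index(per_exit_logits, threshold):
    """Assumed exit rule: stop at the first exit whose top softmax
    probability reaches the threshold; fall back to the last exit."""
    for k, logits in enumerate(per_exit_logits):
        if max(softmax(logits)) >= threshold:
            return k
    return len(per_exit_logits) - 1


def weighted_multi_exit_loss(batch_logits, labels, weights):
    """Batch mean of sum_k weights[i][k] * CE(exit k, sample i).

    weights[i][k] is the weight of sample i at exit k. Here it is a
    plain input; in the paper it is predicted by a separate network.
    """
    total = 0.0
    for sample_logits, label, sample_weights in zip(batch_logits, labels, weights):
        for logits, w in zip(sample_logits, sample_weights):
            total += w * cross_entropy(logits, label)
    return total / len(labels)


# Toy data (hypothetical): two exits, two classes.
easy = [[4.0, 0.0], [5.0, 0.0]]   # confident already at exit 0, label 0
hard = [[0.2, 0.0], [0.0, 3.0]]   # only confident at exit 1, label 1
labels = [0, 1]

# Hand-set weights following the abstract's intuition: the easy sample
# counts more at the early exit, the hard sample more at the late exit.
uniform = [[1.0, 1.0], [1.0, 1.0]]
adaptive = [[1.5, 0.5], [0.5, 1.5]]

print("easy sample exits at:", exit_index(easy, 0.9))
print("hard sample exits at:", exit_index(hard, 0.9))
print("uniform loss:", weighted_multi_exit_loss([easy, hard], labels, uniform))
print("adaptive loss:", weighted_multi_exit_loss([easy, hard], labels, adaptive))
```

With uniform weights the function reduces to the standard multi-exit objective that treats all samples the same. The adaptive weights shift each sample's contribution toward the exit where it would leave the network at test time.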



