Growing a Brain: Fine-Tuning by Increasing Model Capacity

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Pub Date: 2017-07-21. DOI: 10.1109/CVPR.2017.323

Yu-Xiong Wang, Deva Ramanan, M. Hebert

Pages: 3029-3038

Citations: 132

Abstract

CNNs have made an undeniable impact on computer vision through the ability to learn high-capacity models with large annotated training sets. One of their remarkable properties is the ability to transfer knowledge from a large source dataset to a (typically smaller) target dataset. This is usually accomplished through fine-tuning a fixed-size network on new target data. Indeed, virtually every contemporary visual recognition system makes use of fine-tuning to transfer knowledge from ImageNet. In this work, we analyze what components and parameters change during fine-tuning, and discover that increasing model capacity allows for more natural model adaptation through fine-tuning. By making an analogy to developmental learning, we demonstrate that growing a CNN with additional units, either by widening existing layers or deepening the overall network, significantly outperforms classic fine-tuning approaches. But in order to properly grow a network, we show that newly-added units must be appropriately normalized to allow for a pace of learning that is consistent with existing units. We empirically validate our approach on several benchmark datasets, producing state-of-the-art results.
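The widening idea in the abstract can be illustrated with a minimal sketch. It assumes a single fully connected layer stored as a list of weight rows (one row per unit), and it normalizes each newly added unit so that its L2 norm matches the mean norm of the pre-trained units. That normalization rule, the function names `row_norm` and `widen_layer`, and the toy weights are all assumptions made for illustration; the abstract does not specify the authors' exact scheme.

```python
import math
import random


def row_norm(row):
    """Euclidean (L2) norm of one unit's incoming weight vector."""
    return math.sqrt(sum(w * w for w in row))


def widen_layer(weights, num_new, seed=0):
    """Append num_new randomly initialized units to a fully connected layer.

    weights: list of rows, one row of incoming weights per existing unit.
    Each new row is rescaled to the mean L2 norm of the existing rows.
    This is an illustrative normalization choice (an assumption), not
    necessarily the scheme used in the paper.
    """
    rng = random.Random(seed)
    fan_in = len(weights[0])
    # Target scale: average norm of the pre-trained units.
    target = sum(row_norm(r) for r in weights) / len(weights)
    new_rows = []
    for _ in range(num_new):
        row = [rng.gauss(0.0, 1.0) for _ in range(fan_in)]
        scale = target / row_norm(row)
        new_rows.append([w * scale for w in row])
    # Existing units are kept unchanged; new units are appended after them.
    return weights + new_rows


# Toy "pre-trained" layer: 3 units with 2 inputs each (hypothetical values).
pretrained = [[0.3, -0.4], [0.6, 0.8], [0.0, 0.5]]
widened = widen_layer(pretrained, num_new=2)
```

Here `widened` holds the three original units followed by two new ones whose weight norms sit at the same scale as the old units. A complete implementation would also have to extend the following layer with input weights for the new units, which this sketch leaves out.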

