大脑成长:通过增加模型容量进行微调

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2017-07-21 DOI:10.1109/CVPR.2017.323

Yu-Xiong Wang, Deva Ramanan, M. Hebert

{"title":"大脑成长:通过增加模型容量进行微调","authors":"Yu-Xiong Wang, Deva Ramanan, M. Hebert","doi":"10.1109/CVPR.2017.323","DOIUrl":null,"url":null,"abstract":"CNNs have made an undeniable impact on computer vision through the ability to learn high-capacity models with large annotated training sets. One of their remarkable properties is the ability to transfer knowledge from a large source dataset to a (typically smaller) target dataset. This is usually accomplished through fine-tuning a fixed-size network on new target data. Indeed, virtually every contemporary visual recognition system makes use of fine-tuning to transfer knowledge from ImageNet. In this work, we analyze what components and parameters change during fine-tuning, and discover that increasing model capacity allows for more natural model adaptation through fine-tuning. By making an analogy to developmental learning, we demonstrate that growing a CNN with additional units, either by widening existing layers or deepening the overall network, significantly outperforms classic fine-tuning approaches. But in order to properly grow a network, we show that newly-added units must be appropriately normalized to allow for a pace of learning that is consistent with existing units. We empirically validate our approach on several benchmark datasets, producing state-of-the-art results.","PeriodicalId":6631,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"60 1","pages":"3029-3038"},"PeriodicalIF":0.0000,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"132","resultStr":"{\"title\":\"Growing a Brain: Fine-Tuning by Increasing Model Capacity\",\"authors\":\"Yu-Xiong Wang, Deva Ramanan, M. Hebert\",\"doi\":\"10.1109/CVPR.2017.323\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"CNNs have made an undeniable impact on computer vision through the ability to learn high-capacity models with large annotated training sets. One of their remarkable properties is the ability to transfer knowledge from a large source dataset to a (typically smaller) target dataset. This is usually accomplished through fine-tuning a fixed-size network on new target data. Indeed, virtually every contemporary visual recognition system makes use of fine-tuning to transfer knowledge from ImageNet. In this work, we analyze what components and parameters change during fine-tuning, and discover that increasing model capacity allows for more natural model adaptation through fine-tuning. By making an analogy to developmental learning, we demonstrate that growing a CNN with additional units, either by widening existing layers or deepening the overall network, significantly outperforms classic fine-tuning approaches. But in order to properly grow a network, we show that newly-added units must be appropriately normalized to allow for a pace of learning that is consistent with existing units. We empirically validate our approach on several benchmark datasets, producing state-of-the-art results.\",\"PeriodicalId\":6631,\"journal\":{\"name\":\"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)\",\"volume\":\"60 1\",\"pages\":\"3029-3038\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-07-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"132\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CVPR.2017.323\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2017.323","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 132

摘要

cnn通过使用大型带注释的训练集学习高容量模型的能力，对计算机视觉产生了不可否认的影响。它们的一个显著特性是能够将知识从大型源数据集转移到(通常较小的)目标数据集。这通常是通过在新目标数据上微调固定大小的网络来实现的。事实上，几乎每一个当代视觉识别系统都利用微调从ImageNet转移知识。在这项工作中，我们分析了微调过程中哪些组件和参数发生了变化，并发现增加模型容量可以通过微调实现更自然的模型适应。通过与发展性学习进行类比，我们证明了通过扩大现有层或深化整个网络来增加额外单元的CNN，显著优于经典的微调方法。但是，为了适当地发展网络，我们表明，必须适当地规范化新添加的单元，以允许与现有单元一致的学习速度。我们在几个基准数据集上验证了我们的方法，产生了最先进的结果。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Growing a Brain: Fine-Tuning by Increasing Model Capacity

CNNs have made an undeniable impact on computer vision through the ability to learn high-capacity models with large annotated training sets. One of their remarkable properties is the ability to transfer knowledge from a large source dataset to a (typically smaller) target dataset. This is usually accomplished through fine-tuning a fixed-size network on new target data. Indeed, virtually every contemporary visual recognition system makes use of fine-tuning to transfer knowledge from ImageNet. In this work, we analyze what components and parameters change during fine-tuning, and discover that increasing model capacity allows for more natural model adaptation through fine-tuning. By making an analogy to developmental learning, we demonstrate that growing a CNN with additional units, either by widening existing layers or deepening the overall network, significantly outperforms classic fine-tuning approaches. But in order to properly grow a network, we show that newly-added units must be appropriately normalized to allow for a pace of learning that is consistent with existing units. We empirically validate our approach on several benchmark datasets, producing state-of-the-art results.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

自引率

0.00%

发文量

期刊最新文献

FFTLasso: Large-Scale LASSO in the Fourier Domain Semantically Coherent Co-Segmentation and Reconstruction of Dynamic Scenes Coarse-to-Fine Segmentation with Shape-Tailored Continuum Scale Spaces Joint Gap Detection and Inpainting of Line Drawings Wetness and Color from a Single Multispectral Image