Local kernel renormalization as a mechanism for feature learning in overparametrized convolutional neural networks

Nature Communications · IF 15.7 · CAS Region 1 (Multidisciplinary) · Q1 Multidisciplinary Sciences · Pub Date: 2025-01-10 · DOI: 10.1038/s41467-024-55229-3
R. Aiudi, R. Pacelli, P. Baglioni, A. Vezzani, R. Burioni, P. Rotondo
{"title":"Local kernel renormalization as a mechanism for feature learning in overparametrized convolutional neural networks","authors":"R. Aiudi, R. Pacelli, P. Baglioni, A. Vezzani, R. Burioni, P. Rotondo","doi":"10.1038/s41467-024-55229-3","DOIUrl":null,"url":null,"abstract":"<p>Empirical evidence shows that fully-connected neural networks in the infinite-width limit (lazy training) eventually outperform their finite-width counterparts in most computer vision tasks; on the other hand, modern architectures with convolutional layers often achieve optimal performances in the finite-width regime. In this work, we present a theoretical framework that provides a rationale for these differences in one-hidden-layer networks; we derive an effective action in the so-called proportional limit for an architecture with one convolutional hidden layer and compare it with the result available for fully-connected networks. Remarkably, we identify a completely different form of kernel renormalization: whereas the kernel of the fully-connected architecture is just globally renormalized by a single scalar parameter, the convolutional kernel undergoes a local renormalization, meaning that the network can select the local components that will contribute to the final prediction in a data-dependent way. This finding highlights a simple mechanism for feature learning that can take place in overparametrized shallow convolutional neural networks, but not in shallow fully-connected architectures or in locally connected neural networks without weight sharing.</p>","PeriodicalId":19066,"journal":{"name":"Nature Communications","volume":"91 1","pages":""},"PeriodicalIF":15.7000,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature Communications","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1038/s41467-024-55229-3","RegionNum":1,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
Citations: 0

Abstract

Empirical evidence shows that fully-connected neural networks in the infinite-width limit (lazy training) eventually outperform their finite-width counterparts in most computer vision tasks; on the other hand, modern architectures with convolutional layers often achieve optimal performance in the finite-width regime. In this work, we present a theoretical framework that provides a rationale for these differences in one-hidden-layer networks; we derive an effective action in the so-called proportional limit for an architecture with one convolutional hidden layer and compare it with the result available for fully-connected networks. Remarkably, we identify a completely different form of kernel renormalization: whereas the kernel of the fully-connected architecture is just globally renormalized by a single scalar parameter, the convolutional kernel undergoes a local renormalization, meaning that the network can select the local components that will contribute to the final prediction in a data-dependent way. This finding highlights a simple mechanism for feature learning that can take place in overparametrized shallow convolutional neural networks, but not in shallow fully-connected architectures or in locally connected neural networks without weight sharing.
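As a schematic illustration of the distinction drawn in the abstract (the notation below, including the order parameters $\bar{Q}$ and $\bar{Q}_{ij}$ and the patch indices, is assumed here for exposition and may differ from the paper's exact definitions), the renormalized kernels entering the predictor in the proportional limit can be contrasted as

$$\bar{K}_{\mathrm{FC}}(x, x') \;=\; \bar{Q}\, K(x, x'), \qquad \bar{K}_{\mathrm{CNN}}(x, x') \;=\; \sum_{i,j} \bar{Q}_{ij}\, K_{ij}(x_i, x'_j),$$

where $K$ is the infinite-width (NNGP) kernel of the fully-connected network, $x_i$ denotes the $i$-th local patch of the input, $K_{ij}$ is the local kernel component built from patches $i$ and $j$, $\bar{Q}$ is a single data-dependent scalar, and $\bar{Q}_{ij}$ is a data-dependent matrix that weights each local component separately. The scalar $\bar{Q}$ can only rescale the whole kernel, while $\bar{Q}_{ij}$ can emphasize or suppress individual local components, which is the local renormalization, and hence the feature-learning mechanism, described above.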


Source journal: Nature Communications
CiteScore: 24.90
Self-citation rate: 2.40%
Annual article output: 6928
Time to review: 3.7 months
Journal description: Nature Communications, an open-access journal, publishes high-quality research spanning all areas of the natural sciences. Papers featured in the journal showcase significant advances relevant to specialists in each respective field. With a 2-year impact factor of 16.6 (2022) and a median time of 8 days from submission to the first editorial decision, Nature Communications is committed to rapid dissemination of research findings. As a multidisciplinary journal, it welcomes contributions from the biological, health, physical, chemical, Earth, social, mathematical, applied, and engineering sciences, aiming to highlight important breakthroughs within each domain.