Infinite-width limit of deep linear neural networks

IF 3.1; Zone 1, Mathematics; JCR Q1 MATHEMATICS; Communications on Pure and Applied Mathematics; Pub Date: 2024-05-06; DOI: 10.1002/cpa.22200
Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli
Communications on Pure and Applied Mathematics, volume 77, issue 10, pages 3958-4007. Journal Article. https://onlinelibrary.wiley.com/doi/10.1002/cpa.22200
Citations: 0

Abstract

This paper studies the infinite-width limit of deep linear neural networks (NNs) initialized with random parameters. We obtain that, when the number of parameters diverges, the training dynamics converge (in a precise sense) to the dynamics obtained from a gradient descent on an infinitely wide deterministic linear NN. Moreover, even if the weights remain random, we get their precise law along the training dynamics, and prove a quantitative convergence result of the linear predictor in terms of the number of parameters. We finally study the continuous-time limit obtained for infinitely wide linear NNs and show that the linear predictors of the NN converge at an exponential rate to the minimal $\ell_2$-norm minimizer of the risk.
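
The objects named in the abstract can be illustrated with a small finite-width experiment. The following is only a sketch under stated assumptions, not the paper's construction: it assumes a least-squares risk, a scalar output, i.i.d. Gaussian initialization with variance 1/fan-in, and hypothetical sizes and step size. It runs plain gradient descent on a deep linear network and reports how far the resulting linear predictor is from the minimal $\ell_2$-norm minimizer, computed here with the pseudoinverse. At finite width the two need not coincide; the paper's statement concerns the infinite-width limit.

```python
import numpy as np

def min_norm_minimizer(X, y):
    """Minimal l2-norm minimizer of the least-squares risk (via the pseudoinverse)."""
    return np.linalg.pinv(X) @ y

def end_to_end(weights):
    """Linear predictor of the network: the product W_L ... W_1, as a vector in R^d."""
    prod = weights[0]
    for W in weights[1:]:
        prod = W @ prod
    return prod.ravel()

def risk(beta, X, y):
    """Assumed risk: half the mean squared residual."""
    return 0.5 * np.mean((X @ beta - y) ** 2)

def train(X, y, width=32, depth=3, lr=1e-3, steps=5000, seed=0):
    """Gradient descent on all layers of a deep linear network (illustrative sizes)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    dims = [d] + [width] * (depth - 1) + [1]
    # Assumed initialization: Gaussian entries with variance 1/fan_in.
    weights = [rng.normal(0.0, 1.0 / np.sqrt(dims[i]), size=(dims[i + 1], dims[i]))
               for i in range(depth)]
    history = []
    for _ in range(steps):
        beta = end_to_end(weights)
        history.append(risk(beta, X, y))
        # Gradient of the risk with respect to the predictor, as a 1 x d row.
        g = ((X @ beta - y) @ X / n)[None, :]
        grads = []
        for i in range(depth):
            left = np.eye(dims[i + 1])      # product of the layers above layer i
            for W in weights[i + 1:]:
                left = W @ left
            right = np.eye(d)               # product of the layers below layer i
            for W in weights[:i]:
                right = W @ right
            grads.append(left.T @ g @ right.T)   # chain rule for layer i
        weights = [W - lr * G for W, G in zip(weights, grads)]
    history.append(risk(end_to_end(weights), X, y))
    return end_to_end(weights), history

# Underdetermined toy problem (n < d), so the risk has many minimizers.
rng = np.random.default_rng(1)
n, d = 5, 10
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) / np.sqrt(d)

beta_star = min_norm_minimizer(X, y)
beta_gd, history = train(X, y)
print("initial risk:", history[0])
print("final risk:", history[-1])
print("distance to min-norm minimizer:", np.linalg.norm(beta_gd - beta_star))
```

Increasing `width` in this sketch is one informal way to probe the dependence on the number of parameters that the abstract describes; the printed distance is an observation of this toy setup, not a reproduction of the paper's quantitative bound.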

Source journal
CiteScore: 6.70
Self-citation rate: 3.30%
Articles published: 59
Review time: >12 weeks
Journal description: Communications on Pure and Applied Mathematics (ISSN 0010-3640) is published monthly, one volume per year, by John Wiley & Sons, Inc. © 2019. The journal primarily publishes papers originating at or solicited by the Courant Institute of Mathematical Sciences. It features recent developments in applied mathematics, mathematical physics, and mathematical analysis. The topics include partial differential equations, computer science, and applied mathematics. CPAM is devoted to mathematical contributions to the sciences; both theoretical and applied papers, of original or expository type, are included.
Latest articles in this journal
Hydrodynamic large deviations of TASEP
On the derivation of the homogeneous kinetic wave equation
On the stability of Runge–Kutta methods for arbitrarily large systems of ODEs
The $\alpha$-SQG patch problem is illposed in $C^{2,\beta}$ and $W^{2,p}$
Mean-field limit of non-exchangeable systems