
Latest publications in Acta Numerica

Asymptotic-preserving schemes for multiscale physical problems
IF 14.2 · CAS Tier 1 (Mathematics) · Q1 MATHEMATICS · Pub Date: 2021-12-11 · DOI: 10.1017/S0962492922000010
Shi Jin
We present the asymptotic transitions from microscopic to macroscopic physics, their computational challenges and the asymptotic-preserving (AP) strategies to compute multiscale physical problems efficiently. Specifically, we will first study the asymptotic transition from quantum to classical mechanics, from classical mechanics to kinetic theory, and then from kinetic theory to hydrodynamics. We then review some representative AP schemes that mimic these asymptotic transitions at the discrete level, and hence can be used crossing scales and, in particular, capture the macroscopic behaviour without resolving the microscopic physical scale numerically.
{"title":"Asymptotic-preserving schemes for multiscale physical problems","authors":"Shi Jin","doi":"10.1017/S0962492922000010","DOIUrl":"https://doi.org/10.1017/S0962492922000010","url":null,"abstract":"We present the asymptotic transitions from microscopic to macroscopic physics, their computational challenges and the asymptotic-preserving (AP) strategies to compute multiscale physical problems efficiently. Specifically, we will first study the asymptotic transition from quantum to classical mechanics, from classical mechanics to kinetic theory, and then from kinetic theory to hydrodynamics. We then review some representative AP schemes that mimic these asymptotic transitions at the discrete level, and hence can be used crossing scales and, in particular, capture the macroscopic behaviour without resolving the microscopic physical scale numerically.","PeriodicalId":48863,"journal":{"name":"Acta Numerica","volume":"31 1","pages":"415 - 489"},"PeriodicalIF":14.2,"publicationDate":"2021-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43359804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 25
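To make the AP idea concrete, here is a minimal sketch (not taken from the paper) of the simplest setting in which it shows up: the stiff relaxation ODE u'(t) = (g(t) - u)/ε. Treating the relaxation term implicitly gives an update that is stable for time steps chosen independently of ε and whose ε → 0 limit is exactly the macroscopic relation u = g(t), which is the AP property in miniature; the function g and all parameters below are illustrative.

```python
import numpy as np

def g(t):
    """Prescribed macroscopic state that the solution relaxes towards (illustrative)."""
    return np.sin(t)

def imex_relaxation(eps, dt, T, u0=0.0):
    """Backward-Euler treatment of the stiff term in u'(t) = (g(t) - u)/eps.

    The implicit update (u_new - u)/dt = (g(t_new) - u_new)/eps is solvable in
    closed form; as eps -> 0 it reduces to u_new = g(t_new), so the scheme captures
    the macroscopic limit with a time step dt chosen independently of eps.
    """
    u = u0
    for n in range(int(round(T / dt))):
        t_new = (n + 1) * dt
        u = (eps * u + dt * g(t_new)) / (eps + dt)
    return u

if __name__ == "__main__":
    dt, T = 0.1, 2.0  # coarse time step, not resolving the eps scale
    for eps in (1.0, 1e-2, 1e-8):
        print(f"eps={eps:.0e}:  u(T) = {imex_relaxation(eps, dt, T):+.4f},  g(T) = {g(T):+.4f}")
```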
ANU volume 30 Cover and Front matter
IF 14.2 · CAS Tier 1 (Mathematics) · Q1 MATHEMATICS · Pub Date: 2021-05-01 · DOI: 10.1017/s096249292100009x
R. Altmann, P. Henning, D. Peterseim, P. Bartlett, A. Montanari, A. Rakhlin, Ronald A. DeVore, B. Hanin
{"title":"ANU volume 30 Cover and Front matter","authors":"R. Altmann, P. Henning, D. Peterseim, P. Bartlett, A. Montanari, A. Rakhlin, Ronald A. DeVore, B. Hanin","doi":"10.1017/s096249292100009x","DOIUrl":"https://doi.org/10.1017/s096249292100009x","url":null,"abstract":"","PeriodicalId":48863,"journal":{"name":"Acta Numerica","volume":"30 1","pages":"f1 - f6"},"PeriodicalIF":14.2,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47797032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Numerical homogenization beyond scale separation
IF 14.2 · CAS Tier 1 (Mathematics) · Q1 MATHEMATICS · Pub Date: 2021-05-01 · DOI: 10.1017/S0962492921000015
R. Altmann, P. Henning, D. Peterseim
Numerical homogenization is a methodology for the computational solution of multiscale partial differential equations. It aims at reducing complex large-scale problems to simplified numerical models valid on some target scale of interest, thereby accounting for the impact of features on smaller scales that are otherwise not resolved. While constructive approaches in the mathematical theory of homogenization are restricted to problems with a clear scale separation, modern numerical homogenization methods can accurately handle problems with a continuum of scales. This paper reviews such approaches embedded in a historical context and provides a unified variational framework for their design and numerical analysis. Apart from prototypical elliptic model problems, the class of partial differential equations covered here includes wave scattering in heterogeneous media and serves as a template for more general multi-physics problems.
{"title":"Numerical homogenization beyond scale separation","authors":"R. Altmann, P. Henning, D. Peterseim","doi":"10.1017/S0962492921000015","DOIUrl":"https://doi.org/10.1017/S0962492921000015","url":null,"abstract":"Numerical homogenization is a methodology for the computational solution of multiscale partial differential equations. It aims at reducing complex large-scale problems to simplified numerical models valid on some target scale of interest, thereby accounting for the impact of features on smaller scales that are otherwise not resolved. While constructive approaches in the mathematical theory of homogenization are restricted to problems with a clear scale separation, modern numerical homogenization methods can accurately handle problems with a continuum of scales. This paper reviews such approaches embedded in a historical context and provides a unified variational framework for their design and numerical analysis. Apart from prototypical elliptic model problems, the class of partial differential equations covered here includes wave scattering in heterogeneous media and serves as a template for more general multi-physics problems.","PeriodicalId":48863,"journal":{"name":"Acta Numerica","volume":"30 1","pages":"1 - 86"},"PeriodicalIF":14.2,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48106578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 44
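As a concrete, deliberately classical illustration of what numerical homogenization is after, the sketch below (not code from the paper) solves the 1D model problem -(a(x/ε)u')' = 1 on a fine grid that resolves the ε-oscillations of the coefficient and compares it with a coarse solve using the homogenized coefficient, which for this periodic coefficient is its harmonic mean. This is the textbook periodic case with clear scale separation, precisely the setting the survey moves beyond; the coefficient and grid sizes are illustrative.

```python
import numpy as np

def solve_diffusion(a_mid, f_interior, h):
    """Finite-difference solve of -(a u')' = f on (0, 1) with u(0) = u(1) = 0.

    a_mid: coefficient at the N cell midpoints; f_interior: right-hand side at the
    N-1 interior nodes. Returns the solution at the interior nodes.
    """
    main = (a_mid[:-1] + a_mid[1:]) / h**2
    off = -a_mid[1:-1] / h**2
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.solve(A, f_interior)

eps = 1e-2
a = lambda x: 2.0 + np.cos(2 * np.pi * x / eps)  # rapidly oscillating coefficient

# Fine grid resolving the eps-scale oscillations.
Nf = 2000
hf = 1.0 / Nf
u_fine = solve_diffusion(a((np.arange(Nf) + 0.5) * hf), np.ones(Nf - 1), hf)

# Coarse grid with the homogenized coefficient: the harmonic mean of
# 2 + cos(2*pi*y) over one period is sqrt(2**2 - 1) = sqrt(3).
Nc = 50
hc = 1.0 / Nc
u_hom = solve_diffusion(np.full(Nc, np.sqrt(3.0)), np.ones(Nc - 1), hc)

# The coarse homogenized solution tracks the fine-scale solution at the macro scale.
print("fine-scale  u(0.5) ≈", u_fine[Nf // 2 - 1])
print("homogenized u(0.5) ≈", u_hom[Nc // 2 - 1])
```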
Optimal transportation, modelling and numerical simulation
IF 14.2 · CAS Tier 1 (Mathematics) · Q1 MATHEMATICS · Pub Date: 2021-05-01 · DOI: 10.1017/S0962492921000040
J. Benamou
We present an overview of the basic theory, modern optimal transportation extensions and recent algorithmic advances. Selected modelling and numerical applications illustrate the impact of optimal transportation in numerical analysis.
{"title":"Optimal transportation, modelling and numerical simulation","authors":"J. Benamou","doi":"10.1017/S0962492921000040","DOIUrl":"https://doi.org/10.1017/S0962492921000040","url":null,"abstract":"We present an overviewof the basic theory, modern optimal transportation extensions and recent algorithmic advances. Selected modelling and numerical applications illustrate the impact of optimal transportation in numerical analysis.","PeriodicalId":48863,"journal":{"name":"Acta Numerica","volume":"30 1","pages":"249 - 325"},"PeriodicalIF":14.2,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41848083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
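One family of recent algorithmic advances in this area is entropic regularization solved by Sinkhorn-type matrix scaling; the sketch below is a minimal generic implementation for two discrete measures on a line (it is not code from the survey, and the regularization strength, grids and measures are illustrative).

```python
import numpy as np

def sinkhorn(mu, nu, C, reg=0.05, n_iter=500):
    """Entropy-regularised optimal transport between discrete measures mu and nu
    with cost matrix C: alternately rescale rows and columns of the Gibbs kernel
    K = exp(-C/reg) until the transport plan matches both marginals."""
    K = np.exp(-C / reg)
    v = np.ones_like(nu)
    for _ in range(n_iter):
        u = mu / (K @ v)
        v = nu / (K.T @ u)
    P = u[:, None] * K * v[None, :]   # approximate optimal plan
    return P, np.sum(P * C)           # plan and (regularised) transport cost

if __name__ == "__main__":
    # Two discrete measures on [0, 1] with squared-distance cost.
    x = np.linspace(0, 1, 60)
    y = np.linspace(0, 1, 60)
    mu = np.exp(-((x - 0.3) ** 2) / 0.01); mu /= mu.sum()
    nu = np.exp(-((y - 0.7) ** 2) / 0.02); nu /= nu.sum()
    C = (x[:, None] - y[None, :]) ** 2
    P, cost = sinkhorn(mu, nu, C)
    print("marginal errors:", np.abs(P.sum(1) - mu).max(), np.abs(P.sum(0) - nu).max())
    print("entropic transport cost ≈", cost)
```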
Tensors in computations
IF 14.2 · CAS Tier 1 (Mathematics) · Q1 MATHEMATICS · Pub Date: 2021-05-01 · DOI: 10.1017/S0962492921000076
Lek-Heng Lim
The notion of a tensor captures three great ideas: equivariance, multilinearity, separability. But trying to be three things at once makes the notion difficult to understand. We will explain tensors in an accessible and elementary way through the lens of linear algebra and numerical linear algebra, elucidated with examples from computational and applied mathematics.
{"title":"Tensors in computations","authors":"Lek-Heng Lim","doi":"10.1017/S0962492921000076","DOIUrl":"https://doi.org/10.1017/S0962492921000076","url":null,"abstract":"The notion of a tensor captures three great ideas: equivariance, multilinearity, separability. But trying to be three things at once makes the notion difficult to understand. We will explain tensors in an accessible and elementary way through the lens of linear algebra and numerical linear algebra, elucidated with examples from computational and applied mathematics.","PeriodicalId":48863,"journal":{"name":"Acta Numerica","volume":"30 1","pages":"555 - 764"},"PeriodicalIF":14.2,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44930203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
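A small numerical illustration of two of those ideas, separability and multilinearity, is sketched below (it is not from the article): a rank-1 order-3 tensor is an outer product, and contracting the associated trilinear form with three vectors then factorizes into three inner products, replacing an O(n³) contraction by O(n) work.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
a, b, c = rng.standard_normal((3, n))

# Separability: a rank-1 order-3 tensor is an outer product, T_ijk = a_i b_j c_k.
T = np.einsum("i,j,k->ijk", a, b, c)

# Multilinearity: evaluate the associated trilinear form on (x, y, z).
x, y, z = rng.standard_normal((3, n))
dense_value = np.einsum("ijk,i,j,k->", T, x, y, z)   # O(n^3) contraction

# For a separable tensor the same value factorises into three inner products: O(n) work.
factored_value = (a @ x) * (b @ y) * (c @ z)

print(np.allclose(dense_value, factored_value))      # True
```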
Learning physics-based models from data: perspectives from inverse problems and model reduction
IF 14.2 · CAS Tier 1 (Mathematics) · Q1 MATHEMATICS · Pub Date: 2021-05-01 · DOI: 10.1017/S0962492921000064
O. Ghattas, K. Willcox
This article addresses the inference of physics models from data, from the perspectives of inverse problems and model reduction. These fields develop formulations that integrate data into physics-based models while exploiting the fact that many mathematical models of natural and engineered systems exhibit an intrinsically low-dimensional solution manifold. In inverse problems, we seek to infer uncertain components of the inputs from observations of the outputs, while in model reduction we seek low-dimensional models that explicitly capture the salient features of the input–output map through approximation in a low-dimensional subspace. In both cases, the result is a predictive model that reflects data-driven learning yet deeply embeds the underlying physics, and thus can be used for design, control and decision-making, often with quantified uncertainties. We highlight recent developments in scalable and efficient algorithms for inverse problems and model reduction governed by large-scale models in the form of partial differential equations. Several illustrative applications to large-scale complex problems across different domains of science and engineering are provided.
{"title":"Learning physics-based models from data: perspectives from inverse problems and model reduction","authors":"O. Ghattas, K. Willcox","doi":"10.1017/S0962492921000064","DOIUrl":"https://doi.org/10.1017/S0962492921000064","url":null,"abstract":"This article addresses the inference of physics models from data, from the perspectives of inverse problems and model reduction. These fields develop formulations that integrate data into physics-based models while exploiting the fact that many mathematical models of natural and engineered systems exhibit an intrinsically low-dimensional solution manifold. In inverse problems, we seek to infer uncertain components of the inputs from observations of the outputs, while in model reduction we seek low-dimensional models that explicitly capture the salient features of the input–output map through approximation in a low-dimensional subspace. In both cases, the result is a predictive model that reflects data-driven learning yet deeply embeds the underlying physics, and thus can be used for design, control and decision-making, often with quantified uncertainties. We highlight recent developments in scalable and efficient algorithms for inverse problems and model reduction governed by large-scale models in the form of partial differential equations. Several illustrative applications to large-scale complex problems across different domains of science and engineering are provided.","PeriodicalId":48863,"journal":{"name":"Acta Numerica","volume":"30 1","pages":"445 - 554"},"PeriodicalIF":14.2,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47260397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 69
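One standard building block of the projection-based model reduction the article surveys is proper orthogonal decomposition (POD): collect snapshots of a parametrized solution, take an SVD, and keep the leading left singular vectors as a low-dimensional basis. The sketch below is a generic illustration on a synthetic snapshot family, not the authors' code; the snapshot model, basis size and test parameter are all illustrative.

```python
import numpy as np

# Synthetic snapshots of a parametrised "solution" u(x; mu) on a grid: a Gaussian
# bump whose location depends on the parameter mu (purely illustrative).
x = np.linspace(0.0, 1.0, 400)
mus = np.linspace(0.2, 0.8, 50)
snapshots = np.stack([np.exp(-((x - m) ** 2) / 0.01) for m in mus], axis=1)  # shape (400, 50)

# POD: the leading left singular vectors of the snapshot matrix span the best
# r-dimensional linear subspace for the snapshots in the least-squares sense.
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
r = 10
V = U[:, :r]  # reduced basis

# Project an unseen configuration onto the reduced subspace and measure the error.
u_new = np.exp(-((x - 0.55) ** 2) / 0.01)
u_proj = V @ (V.T @ u_new)
energy = np.sum(s[:r] ** 2) / np.sum(s ** 2)
rel_err = np.linalg.norm(u_new - u_proj) / np.linalg.norm(u_new)
print(f"retained snapshot energy: {energy:.4f},  relative projection error: {rel_err:.2e}")
```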
Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
IF 14.2 · CAS Tier 1 (Mathematics) · Q1 MATHEMATICS · Pub Date: 2021-05-01 · DOI: 10.1017/S0962492921000039
M. Belkin
In the past decade the mathematical theory of machine learning has lagged far behind the triumphs of deep neural networks on practical challenges. However, the gap between theory and practice is gradually starting to close. In this paper I will attempt to assemble some pieces of the remarkable and still incomplete mathematical mosaic emerging from the efforts to understand the foundations of deep learning. The two key themes will be interpolation and its sibling over-parametrization. Interpolation corresponds to fitting data, even noisy data, exactly. Over-parametrization enables interpolation and provides flexibility to select a suitable interpolating model. As we will see, just as a physical prism separates colours mixed within a ray of light, the figurative prism of interpolation helps to disentangle generalization and optimization properties within the complex picture of modern machine learning. This article is written in the belief and hope that clearer understanding of these issues will bring us a step closer towards a general theory of deep learning and machine learning.
{"title":"Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation","authors":"M. Belkin","doi":"10.1017/S0962492921000039","DOIUrl":"https://doi.org/10.1017/S0962492921000039","url":null,"abstract":"In the past decade the mathematical theory of machine learning has lagged far behind the triumphs of deep neural networks on practical challenges. However, the gap between theory and practice is gradually starting to close. In this paper I will attempt to assemble some pieces of the remarkable and still incomplete mathematical mosaic emerging from the efforts to understand the foundations of deep learning. The two key themes will be interpolation and its sibling over-parametrization. Interpolation corresponds to fitting data, even noisy data, exactly. Over-parametrization enables interpolation and provides flexibility to select a suitable interpolating model. As we will see, just as a physical prism separates colours mixed within a ray of light, the figurative prism of interpolation helps to disentangle generalization and optimization properties within the complex picture of modern machine learning. This article is written in the belief and hope that clearer understanding of these issues will bring us a step closer towards a general theory of deep learning and machine learning.","PeriodicalId":48863,"journal":{"name":"Acta Numerica","volume":"30 1","pages":"203 - 248"},"PeriodicalIF":14.2,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46793783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 116
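The two themes, interpolation and over-parametrization, can be seen in a very small experiment (not from the paper): with more random features than data points, the minimum-norm least-squares solution fits noisy training data exactly, and its behaviour away from the training points can then be inspected. The feature map, noise level and sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy 1-D training data.
n = 30
x = np.sort(rng.uniform(-1.0, 1.0, n))
y = np.sin(3 * x) + 0.1 * rng.standard_normal(n)

# Over-parametrised random Fourier feature model: far more features than data points.
p = 500
w = 5.0 * rng.standard_normal(p)
b = rng.uniform(0.0, 2 * np.pi, p)
features = lambda t: np.cos(np.outer(t, w) + b) / np.sqrt(p)

# The pseudoinverse picks the minimum-norm coefficient vector among all solutions
# of the underdetermined interpolation system features(x) @ c = y.
c = np.linalg.pinv(features(x)) @ y

# The model interpolates the noisy training data (residual at round-off level) ...
print("max training residual:", np.abs(features(x) @ c - y).max())

# ... and its behaviour off the training set can be compared with the noiseless signal.
t = np.linspace(-1.0, 1.0, 200)
print("mean absolute error vs sin(3t):", np.abs(features(t) @ c - np.sin(3 * t)).mean())
```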
ANU volume 30 Cover and Back matter
IF 14.2 · CAS Tier 1 (Mathematics) · Q1 MATHEMATICS · Pub Date: 2021-05-01 · DOI: 10.1017/s0962492921000106
{"title":"ANU volume 30 Cover and Back matter","authors":"","doi":"10.1017/s0962492921000106","DOIUrl":"https://doi.org/10.1017/s0962492921000106","url":null,"abstract":"","PeriodicalId":48863,"journal":{"name":"Acta Numerica","volume":"30 1","pages":"b1 - b1"},"PeriodicalIF":14.2,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46952396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Modelling and computation of liquid crystals
IF 14.2 · CAS Tier 1 (Mathematics) · Q1 MATHEMATICS · Pub Date: 2021-04-06 · DOI: 10.1017/S0962492921000088
Wen Wang, Lei Zhang, Pingwen Zhang
Liquid crystals are a type of soft matter that is intermediate between crystalline solids and isotropic fluids. The study of liquid crystals has made tremendous progress over the past four decades, which is of great importance for fundamental scientific research and has widespread applications in industry. In this paper we review the mathematical models and their connections to liquid crystals, and survey the developments of numerical methods for finding rich configurations of liquid crystals.
{"title":"Modelling and computation of liquid crystals","authors":"Wen Wang, Lei Zhang, Pingwen Zhang","doi":"10.1017/S0962492921000088","DOIUrl":"https://doi.org/10.1017/S0962492921000088","url":null,"abstract":"Liquid crystals are a type of soft matter that is intermediate between crystalline solids and isotropic fluids. The study of liquid crystals has made tremendous progress over the past four decades, which is of great importance for fundamental scientific research and has widespread applications in industry. In this paper we review the mathematical models and their connections to liquid crystals, and survey the developments of numerical methods for finding rich configurations of liquid crystals.","PeriodicalId":48863,"journal":{"name":"Acta Numerica","volume":"30 1","pages":"765 - 851"},"PeriodicalIF":14.2,"publicationDate":"2021-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49450682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 34
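As a minimal taste of the kind of computation involved, the sketch below relaxes a planar nematic director field n = (cos θ, sin θ) in the one-constant approximation, where the Oseen–Frank energy reduces to the Dirichlet energy (K/2)∫|∇θ|² and equilibrium angle fields are harmonic. A Jacobi iteration with prescribed boundary anchoring is used; this is far simpler than the full models reviewed in the survey, and the anchoring data and grid are illustrative.

```python
import numpy as np

# Planar nematic in the one-constant approximation: director n = (cos θ, sin θ),
# Oseen–Frank energy (K/2)∫|∇θ|², so an equilibrium angle field θ is harmonic.
N = 81
theta = np.zeros((N, N))

# Prescribed boundary anchoring (illustrative): the director rotates from angle 0
# on the bottom edge to pi/2 on the top edge, with linear anchoring on the sides.
theta[0, :] = 0.0
theta[-1, :] = np.pi / 2
theta[:, 0] = np.linspace(0.0, np.pi / 2, N)
theta[:, -1] = np.linspace(0.0, np.pi / 2, N)

# Jacobi relaxation towards the discrete harmonic (minimum-energy) configuration.
for _ in range(20000):
    theta[1:-1, 1:-1] = 0.25 * (theta[2:, 1:-1] + theta[:-2, 1:-1]
                                + theta[1:-1, 2:] + theta[1:-1, :-2])

# Discrete Dirichlet energy (K = 1) and the director angle at the centre of the cell.
h = 1.0 / (N - 1)
gx, gy = np.gradient(theta, h)
print("relaxed energy ≈", 0.5 * np.sum(gx**2 + gy**2) * h**2)
print("director angle at the centre ≈", theta[N // 2, N // 2], "(exact harmonic value: pi/4)")
```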
Deep learning: a statistical viewpoint
IF 14.2 · CAS Tier 1 (Mathematics) · Q1 MATHEMATICS · Pub Date: 2021-03-16 · DOI: 10.1017/S0962492921000027
P. Bartlett, A. Montanari, A. Rakhlin
The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting, that is, accurate predictions despite overfitting training data. In this article, we survey recent progress in statistical learning theory that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behaviour of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favourable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.
{"title":"Deep learning: a statistical viewpoint","authors":"P. Bartlett, A. Montanari, A. Rakhlin","doi":"10.1017/S0962492921000027","DOIUrl":"https://doi.org/10.1017/S0962492921000027","url":null,"abstract":"The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting, that is, accurate predictions despite overfitting training data. In this article, we survey recent progress in statistical learning theory that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behaviour of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favourable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.","PeriodicalId":48863,"journal":{"name":"Acta Numerica","volume":"30 1","pages":"87 - 201"},"PeriodicalIF":14.2,"publicationDate":"2021-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1017/S0962492921000027","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41948779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 177
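The implicit regularization the authors describe ("gradient methods lead to minimal norm functions that perfectly fit the training data") has a particularly transparent instance in over-parametrized linear least squares: gradient descent started from zero converges to the minimum-norm interpolating solution. The sketch below (illustrative sizes, not code from the article) compares the gradient-descent iterate with the pseudoinverse solution.

```python
import numpy as np

rng = np.random.default_rng(2)

# Over-parametrised linear regression: more parameters (p) than observations (n),
# so infinitely many weight vectors fit the data exactly.
n, p = 20, 200
X = rng.standard_normal((n, p)) / np.sqrt(p)
y = rng.standard_normal(n)

# Gradient descent on the squared loss, started from zero.
w = np.zeros(p)
lr = 0.5 / np.linalg.norm(X, 2) ** 2  # step size below 1/L, L = largest eigenvalue of X^T X
for _ in range(2000):
    w -= lr * X.T @ (X @ w - y)

# The minimum-norm interpolant, computed directly via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

print("training residual of the GD iterate :", np.linalg.norm(X @ w - y))
print("distance to the min-norm interpolant:", np.linalg.norm(w - w_min_norm))
```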