Deep Neural Networks: Multi-Classification and Universal Approximation

Martín Hernández, Enrique Zuazua
arXiv - STAT - Machine Learning · 2024-09-10 · arXiv:2409.06555
Citations: 0

Abstract

We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements in $\mathbb{R}^d$, where $d\ge1,$ and $M$ classes, thereby ensuring accurate classification. By modeling the neural network as a time-discrete nonlinear dynamical system, we interpret the memorization property as a problem of simultaneous or ensemble controllability. This problem is addressed by constructing the network parameters inductively and explicitly, bypassing the need for training or solving any optimization problem. Additionally, we establish that such a network can achieve universal approximation in $L^p(\Omega;\mathbb{R}_+)$, where $\Omega$ is a bounded subset of $\mathbb{R}^d$ and $p\in[1,\infty)$, using a ReLU deep neural network with a width of $d+1$. We also provide depth estimates for approximating $W^{1,p}$ functions and width estimates for approximating $L^p(\Omega;\mathbb{R}^m)$ for $m\geq1$. Our proofs are constructive, offering explicit values for the biases and weights involved.
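The abstract's key viewpoint is that a deep, narrow ReLU network is a time-discrete nonlinear dynamical system: each layer is one step of the map $x_{k+1} = \mathrm{ReLU}(W_k x_k + b_k)$. The sketch below illustrates only this iterated-map view with random placeholder weights; it is not the paper's explicit inductive construction of the parameters.

```python
import numpy as np

def relu(x):
    # Componentwise ReLU activation.
    return np.maximum(x, 0.0)

def forward(x, layers):
    """Iterate the discrete-time dynamics x_{k+1} = ReLU(W_k x_k + b_k).

    `layers` is a list of (W, b) pairs; each pair is one layer / one
    time step of the dynamical system.
    """
    for W, b in layers:
        x = relu(W @ x + b)
    return x

# Hypothetical width-2 network (random weights, NOT the explicit
# biases and weights constructed in the paper).
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((2, 2)), rng.standard_normal(2))
          for _ in range(7)]

x0 = np.array([0.5, -1.0])
out = forward(x0, layers)
# The state stays in R^2 at every step (width 2), and every
# coordinate is non-negative after the final ReLU.
print(out.shape)
print(bool((out >= 0).all()))
```

In this picture, "finite sample memorization" becomes simultaneous (ensemble) controllability: one shared sequence of layer maps must steer all $N$ initial states to their class targets at once, which is what the paper's inductive choice of weights and biases achieves without training.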