Uniform Convergence of Deep Neural Networks With Lipschitz Continuous Activation Functions and Variable Widths

IF 2.2 | Zone 3 (Computer Science) | JCR Q3, COMPUTER SCIENCE, INFORMATION SYSTEMS | IEEE Transactions on Information Theory | Pub Date: 2024-08-05 | DOI: 10.1109/TIT.2024.3439136

Yuesheng Xu; Haizhang Zhang

Citation: Yuesheng Xu and Haizhang Zhang, "Uniform Convergence of Deep Neural Networks With Lipschitz Continuous Activation Functions and Variable Widths," IEEE Transactions on Information Theory, vol. 70, no. 10, pp. 7125-7142, 2024. Journal article; not open access. https://ieeexplore.ieee.org/document/10623495/

Citations: 0



Uniform Convergence of Deep Neural Networks With Lipschitz Continuous Activation Functions and Variable Widths

We consider deep neural networks (DNNs) with a Lipschitz continuous activation function and with weight matrices of variable widths. We establish a uniform convergence analysis framework that provides sufficient conditions on the weight matrices and bias vectors, together with the Lipschitz constant, to ensure that the DNNs converge uniformly to a meaningful function as the number of layers tends to infinity. Within this framework, we present specific results on the uniform convergence of DNNs with a fixed width, bounded widths, and unbounded widths. In particular, since convolutional neural networks are special DNNs with weight matrices of increasing widths, we put forward conditions on the mask sequence that lead to uniform convergence of the resulting convolutional neural networks. The Lipschitz continuity assumption on the activation function allows our theory to cover most of the activation functions commonly used in applications.
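To make the closing remark about Lipschitz continuity concrete, the following is a minimal sketch, not taken from the paper, that numerically estimates the Lipschitz constants of a few widely used activation functions. The function names, the interval [-5, 5], and the grid size are illustrative assumptions. The finite-difference estimate is a lower bound on the true constant, which is 1 for ReLU, leaky ReLU (with slope at most 1), and tanh, and 1/4 for the logistic sigmoid.

```python
import math


def relu(t):
    return max(0.0, t)


def leaky_relu(t, slope=0.01):
    # slope is an illustrative choice, not a value from the paper
    return t if t >= 0 else slope * t


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def estimate_lipschitz(f, lo=-5.0, hi=5.0, steps=10_000):
    """Largest finite-difference slope of f on a uniform grid over [lo, hi].

    By the mean value theorem this never exceeds the true Lipschitz
    constant (up to rounding), so it is a lower-bound estimate.
    """
    h = (hi - lo) / steps
    best = 0.0
    prev = f(lo)
    for i in range(1, steps + 1):
        cur = f(lo + i * h)
        best = max(best, abs(cur - prev) / h)
        prev = cur
    return best


ACTIVATIONS = {
    "relu": relu,
    "leaky_relu": leaky_relu,
    "tanh": math.tanh,
    "sigmoid": sigmoid,
}

# Estimated Lipschitz constant for each activation on [-5, 5]
ESTIMATES = {name: estimate_lipschitz(f) for name, f in ACTIVATIONS.items()}

if __name__ == "__main__":
    for name, value in ESTIMATES.items():
        print(f"{name}: {value:.4f}")
```

All four estimates are finite, consistent with these activations satisfying a Lipschitz assumption of the kind the abstract describes. The sufficient conditions the paper places on weights, biases, and the Lipschitz constant are not reproduced here.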


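Source journal: IEEE Transactions on Information Theory (Engineering & Technology; Engineering: Electrical & Electronic)
CiteScore: 5.70
Self-citation rate: 20.00%
Articles published: 514
Review time: 12 months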

Journal description: The IEEE Transactions on Information Theory is a journal that publishes theoretical and experimental papers concerned with the transmission, processing, and utilization of information. The boundaries of acceptable subject matter are intentionally not sharply delimited. Rather, it is hoped that as the focus of research activity changes, a flexible policy will permit this Transactions to follow suit. Current appropriate topics are best reflected by recent Tables of Contents; they are summarized in the titles of editorial areas that appear on the inside front cover.
