{"title":"On the Upper Bounds of Number of Linear Regions and Generalization Error of Deep Convolutional Neural Networks.","authors":"Degang Chen, Jiayu Liu, Xiaoya Che","doi":"10.1109/TPAMI.2025.3548620","DOIUrl":null,"url":null,"abstract":"<p><p>Understanding the effect of hyperparameters of the network structure on the performance of Convolutional Neural Networks (CNNs) remains the most fundamental and urgent issue in deep learning, and we attempt to address this issue based on the piecewise linear (PWL) function nature of CNNs in this paper. Firstly, the operations of convolutions, ReLUs and Max pooling in a CNN are represented as the multiplication of multiple matrices for a fixed sample in order to obtain an algebraic expression of CNNs, this expression clearly suggests that CNNs are PWL functions. Although such representation has high time complexity, it provides a more convenient and intuitive way to study the mathematical properties of CNNs. Secondly, we develop a tight bound of the number of linear regions and the upper bounds of generalization error for CNNs, both taking into account factors such as the number of layers, dimension of pooling, and the width in the network. 
The above research results provide a possible guidance for designing and training CNNs.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"PP ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TPAMI.2025.3548620","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Understanding how hyperparameters of the network structure affect the performance of Convolutional Neural Networks (CNNs) remains a fundamental and urgent issue in deep learning. In this paper, we address this issue based on the piecewise linear (PWL) nature of CNNs. Firstly, for a fixed sample, the convolution, ReLU, and max-pooling operations in a CNN are represented as the product of multiple matrices, yielding an algebraic expression of the network; this expression makes clear that CNNs are PWL functions. Although computing such a representation has high time complexity, it provides a more convenient and intuitive way to study the mathematical properties of CNNs. Secondly, we develop a tight bound on the number of linear regions and upper bounds on the generalization error of CNNs, both accounting for factors such as the number of layers, the pooling dimension, and the network width. These results offer possible guidance for designing and training CNNs.
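The core observation in the abstract — that for a fixed input, a convolution-ReLU-max-pooling layer collapses to a single matrix product — can be illustrated with a small sketch. The toy 1-D layer, kernel, and sizes below are illustrative assumptions, not the paper's construction: convolution becomes a Toeplitz matrix W, ReLU becomes a 0/1 diagonal matrix D determined by the sign pattern of the pre-activations, and max pooling becomes a 0/1 selection matrix P, so the layer output equals (P D W) x.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed sample for a toy 1-D layer: convolution -> ReLU -> max pooling.
x = rng.standard_normal(8)

# "Valid" convolution with kernel k is multiplication by a Toeplitz matrix W.
k = np.array([0.5, -1.0, 0.25])
n_out = len(x) - len(k) + 1                  # 6 pre-activations
W = np.zeros((n_out, len(x)))
for i in range(n_out):
    W[i, i:i + len(k)] = k

z = W @ x                                    # pre-activations

# For this fixed x, ReLU is multiplication by a 0/1 diagonal matrix D.
D = np.diag((z > 0).astype(float))
a = D @ z
assert np.allclose(a, np.maximum(z, 0))

# Max pooling (width 2, stride 2) selects one entry per window; since the
# ReLU outputs are nonnegative, it is multiplication by a selection matrix P.
P = np.zeros((n_out // 2, n_out))
for j in range(n_out // 2):
    window = a[2 * j:2 * j + 2]
    P[j, 2 * j + int(np.argmax(window))] = 1.0

y = P @ a
assert np.allclose(y, a.reshape(-1, 2).max(axis=1))

# The whole layer reduces to one linear map M = P @ D @ W for this sample:
# within the linear region containing x, the CNN acts as M.
M = P @ D @ W
assert np.allclose(y, M @ x)
```

Note that D and P depend on the sign and argmax patterns induced by x, so M is constant only within the linear region containing x; counting the distinct attainable patterns is what bounds the number of linear regions.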