{"title":"卷积层谱规范的严密而高效的上界","authors":"Ekaterina Grishina, Mikhail Gorbunov, Maxim Rakhuba","doi":"arxiv-2409.11859","DOIUrl":null,"url":null,"abstract":"Controlling the spectral norm of the Jacobian matrix, which is related to the\nconvolution operation, has been shown to improve generalization, training\nstability and robustness in CNNs. Existing methods for computing the norm\neither tend to overestimate it or their performance may deteriorate quickly\nwith increasing the input and kernel sizes. In this paper, we demonstrate that\nthe tensor version of the spectral norm of a four-dimensional convolution\nkernel, up to a constant factor, serves as an upper bound for the spectral norm\nof the Jacobian matrix associated with the convolution operation. This new\nupper bound is independent of the input image resolution, differentiable and\ncan be efficiently calculated during training. Through experiments, we\ndemonstrate how this new bound can be used to improve the performance of\nconvolutional architectures.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Tight and Efficient Upper Bound on Spectral Norm of Convolutional Layers\",\"authors\":\"Ekaterina Grishina, Mikhail Gorbunov, Maxim Rakhuba\",\"doi\":\"arxiv-2409.11859\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Controlling the spectral norm of the Jacobian matrix, which is related to the\\nconvolution operation, has been shown to improve generalization, training\\nstability and robustness in CNNs. Existing methods for computing the norm\\neither tend to overestimate it or their performance may deteriorate quickly\\nwith increasing the input and kernel sizes. In this paper, we demonstrate that\\nthe tensor version of the spectral norm of a four-dimensional convolution\\nkernel, up to a constant factor, serves as an upper bound for the spectral norm\\nof the Jacobian matrix associated with the convolution operation. This new\\nupper bound is independent of the input image resolution, differentiable and\\ncan be efficiently calculated during training. Through experiments, we\\ndemonstrate how this new bound can be used to improve the performance of\\nconvolutional architectures.\",\"PeriodicalId\":501301,\"journal\":{\"name\":\"arXiv - CS - Machine Learning\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Machine Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.11859\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11859","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Tight and Efficient Upper Bound on Spectral Norm of Convolutional Layers
Controlling the spectral norm of the Jacobian matrix associated with the convolution operation has been shown to improve generalization, training stability, and robustness in CNNs. Existing methods for computing this norm either tend to overestimate it, or their performance may deteriorate quickly as the input and kernel sizes increase. In this paper, we demonstrate that the tensor version of the spectral norm of a four-dimensional convolution kernel, up to a constant factor, serves as an upper bound for the spectral norm of the Jacobian matrix associated with the convolution operation. This new upper bound is independent of the input image resolution, is differentiable, and can be efficiently calculated during training. Through experiments, we demonstrate how this new bound can be used to improve the performance of convolutional architectures.
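The abstract itself contains no code, so the following is only a minimal sketch of how the tensor spectral norm of a 4-D kernel could be approximated in practice. It assumes PyTorch, a kernel of shape (c_out, c_in, h, w), and uses the standard higher-order power method; the function name `tensor_spectral_norm` and the iteration count are hypothetical choices, not the authors' implementation.

```python
import torch

def tensor_spectral_norm(kernel: torch.Tensor, n_iters: int = 20) -> torch.Tensor:
    """Approximate the tensor spectral norm of a 4-D conv kernel K of shape
    (c_out, c_in, h, w), i.e. the maximum of |K x1 u x2 v x3 a x4 b| over
    unit vectors u, v, a, b, via higher-order power iteration. Note that the
    power method may converge to a local (not global) maximum."""
    c_out, c_in, h, w = kernel.shape
    # Random unit-norm initial vectors (an arbitrary initialization choice).
    u = torch.randn(c_out); u = u / u.norm()
    v = torch.randn(c_in);  v = v / v.norm()
    a = torch.randn(h);     a = a / a.norm()
    b = torch.randn(w);     b = b / b.norm()
    for _ in range(n_iters):
        # Update each factor as the contraction of the kernel with the
        # other three vectors, then renormalize.
        u = torch.einsum('oihw,i,h,w->o', kernel, v, a, b); u = u / u.norm()
        v = torch.einsum('oihw,o,h,w->i', kernel, u, a, b); v = v / v.norm()
        a = torch.einsum('oihw,o,i,w->h', kernel, u, v, b); a = a / a.norm()
        b = torch.einsum('oihw,o,i,h->w', kernel, u, v, a); b = b / b.norm()
    # Value of the multilinear form at the converged vectors.
    return torch.einsum('oihw,o,i,h,w->', kernel, u, v, a, b).abs()
```

Per the abstract, scaling such a quantity by a constant factor (whose exact value is given in the paper, not here) yields an upper bound on the spectral norm of the convolution's Jacobian that does not depend on the input resolution; since the computation is differentiable, it could be added as a regularization term during training.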