Authors: Samet Oymak; Mahdi Soltanolkotabi
Journal: Information and Inference: A Journal of the IMA, volume 10, issue 3, pages 1031-1071
Published: 2021-02-01
DOI: 10.1093/imaiai/iaaa042
Learning a deep convolutional neural network via tensor decomposition
In this paper, we study the problem of learning the weights of a deep convolutional neural network. We consider a network in which convolutions are carried out over non-overlapping patches, and we develop an algorithm that learns all of the kernels simultaneously from the training data. Our approach, dubbed deep tensor decomposition (DeepTD), is based on a low-rank tensor decomposition. We analyze DeepTD theoretically under a realizable model for the training data, in which the inputs are drawn i.i.d. from a Gaussian distribution and the labels are generated by planted convolutional kernels. We show that DeepTD is sample efficient and provably works as soon as the sample size exceeds the total number of convolutional weights in the network.
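The abstract does not spell out how the tensor is built, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes one kernel per layer, an identity activation (so that the planted network is exactly a rank-1 tensor applied to the reshaped input), a label-weighted average of the inputs as the tensor to decompose, and alternating power iteration as the rank-1 solver. The names `forward`, `rank1_factors` and `deeptd_sketch` are hypothetical.

```python
import numpy as np

def forward(X, kernels, act=lambda z: z):
    """Batch forward pass of a CNN whose layers act on non-overlapping patches.

    X has shape (n, p) with p equal to the product of the kernel sizes.
    The identity activation is an assumption made for this sketch.
    """
    h = X
    for k in kernels:
        # Split each row into disjoint patches of length k.size, one output per patch.
        h = act(h.reshape(len(X), -1, k.size) @ k)
    return h[:, 0]  # a single scalar per input remains after the last layer

def rank1_factors(T, iters=20):
    """Rank-1 approximation of T by alternating power iteration (assumed solver)."""
    # Initialise each factor with the top left singular vector of the mode unfolding.
    vs = [np.linalg.svd(np.moveaxis(T, m, 0).reshape(T.shape[m], -1))[0][:, 0]
          for m in range(T.ndim)]
    for _ in range(iters):
        for m in range(T.ndim):
            u = T
            # Contract every mode except m, highest axis first so indices stay valid.
            for a in reversed(range(T.ndim)):
                if a != m:
                    u = np.tensordot(u, vs[a], axes=([a], [0]))
            vs[m] = u / np.linalg.norm(u)
    return vs

def deeptd_sketch(X, y, dims):
    """Estimate unit-norm kernels (up to sign) from Gaussian inputs and labels.

    dims = (d_1, ..., d_D) lists the kernel sizes, first layer first.
    """
    w = y - y.mean()
    # Label-weighted input average; in C order the last axis belongs to layer 1.
    T = (w[:, None] * X).mean(axis=0).reshape(dims[::-1])
    return rank1_factors(T)[::-1]  # reorder so the estimates run from layer 1 to D

# Realizable model: i.i.d. Gaussian inputs, labels from planted kernels.
rng = np.random.default_rng(1)
dims = (3, 4, 2)
planted = [rng.standard_normal(d) for d in dims]
planted = [k / np.linalg.norm(k) for k in planted]
n = 20000
X = rng.standard_normal((n, int(np.prod(dims))))
y = forward(X, planted)
est = deeptd_sketch(X, y, dims)
corr = [abs(e @ k) for e, k in zip(est, planted)]
print("absolute correlation per layer:", np.round(corr, 3))
```

Each kernel is recoverable only up to sign and scale from a rank-1 factorization, which is why the comparison uses the absolute correlation between unit-norm vectors. With a nonlinear activation the label-weighted tensor is no longer exactly rank-1 in expectation, and this sketch makes no claim about that case.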