Title: On Gradient Descent for On-Chip Learning
Authors: J. Sum, Janet C.C. Chang
DOI: 10.1109/taai54685.2021.00034
Published in: 2021 International Conference on Technologies and Applications of Artificial Intelligence (TAAI), November 2021
Abstract: Recently, it has been shown that gradient descent learning (GDL) may fail when training a neural network (NN) subject to persistent weight noise. In the presence of multiplicative weight noise (resp. node noise), the model generated by GDL is not the desired model, i.e. the one that minimizes the expected mean-squared error (MSE) under multiplicative weight noise (resp. node noise). In this paper, the analysis is formalized under a conceptual framework called suitability and extended to gradient descent learning with momentum (GDM). A learning algorithm is suitable for on-chip implementation to train an NN with weight noise if its learning objective is identical to the expected MSE of the NN under the same noise. In this regard, it is shown that GDL and GDM are not suitable for on-chip implementation. Theoretical analysis supported by experimental evidence is presented for these claims.
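The abstract's central claim can be illustrated with a toy linear model (a hypothetical setup for illustration, not the paper's actual experiments): under persistent multiplicative weight noise, plain gradient descent converges to the noiseless least-squares solution, which differs from the minimizer of the expected noisy MSE whenever the noise variance is nonzero.

```python
import numpy as np

# Toy sketch (assumed setting, not the paper's experiments): a linear
# model y = w^T x is trained by gradient descent while multiplicative
# weight noise w_i -> w_i * (1 + b_i) corrupts every forward pass, as
# it would on noisy analog hardware.
rng = np.random.default_rng(0)
d, steps, sb, eta = 3, 200_000, 0.5, 0.01   # sb = weight-noise variance S_b
w_true = np.array([1.0, -2.0, 0.5])

# Minimizer of the EXPECTED noisy MSE
#   J(w) = E[(y - w^T x)^2] + S_b * sum_i w_i^2 E[x_i^2].
# With x ~ N(0, I) and y = w_true^T x, this gives w* = w_true / (1 + S_b).
w_star = w_true / (1.0 + sb)

w = np.zeros(d)
w_sum = np.zeros(d)
for t in range(steps):
    x = rng.standard_normal(d)
    y = w_true @ x
    b = rng.normal(0.0, np.sqrt(sb), d)     # persistent multiplicative noise
    y_hat = (w * (1.0 + b)) @ x             # noisy forward pass
    w += eta * (y - y_hat) * x              # on-chip GDL update
    if t >= steps // 2:                     # average out the noise floor
        w_sum += w
w_avg = w_sum / (steps - steps // 2)

print("GDL converges near:  ", np.round(w_avg, 2))   # close to w_true
print("expected-MSE minimum:", np.round(w_star, 2))  # != w_true since S_b > 0
```

Because the noise has zero mean, the expected GDL update direction is the gradient of the *noiseless* MSE, so the algorithm settles near w_true; the objective it should minimize on noisy hardware is instead minimized at w* = w_true / (1 + S_b). This gap between the de facto learning objective and the expected noisy MSE is what the paper's suitability framework formalizes.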