{"title":"关于优化器-激活对的收敛性","authors":"Dachuan Zhao","doi":"10.1109/ICCC51575.2020.9345160","DOIUrl":null,"url":null,"abstract":"The effect of training of deep neural network depends on the selection of the activation function and the optimizer, because the different activation functions lead to distinct loss curvature and the different optimizers will have different performance in distinct curvatures. In this paper, we select different combinations of activation functions and optimizers, seek to select the best combination under the same experiment setting, and take a general discussion for the efficiency of these combinations finally. Moreover, to guarantee fair comparison the hyperparameters tuning is conducted.","PeriodicalId":386048,"journal":{"name":"2020 IEEE 6th International Conference on Computer and Communications (ICCC)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"On the convergence of optimizer-activation pairs\",\"authors\":\"Dachuan Zhao\",\"doi\":\"10.1109/ICCC51575.2020.9345160\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The effect of training of deep neural network depends on the selection of the activation function and the optimizer, because the different activation functions lead to distinct loss curvature and the different optimizers will have different performance in distinct curvatures. In this paper, we select different combinations of activation functions and optimizers, seek to select the best combination under the same experiment setting, and take a general discussion for the efficiency of these combinations finally. Moreover, to guarantee fair comparison the hyperparameters tuning is conducted.\",\"PeriodicalId\":386048,\"journal\":{\"name\":\"2020 IEEE 6th International Conference on Computer and Communications (ICCC)\",\"volume\":\"66 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE 6th International Conference on Computer and Communications (ICCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCC51575.2020.9345160\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 6th International Conference on Computer and Communications (ICCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCC51575.2020.9345160","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
The training performance of a deep neural network depends on the choice of activation function and optimizer: different activation functions induce distinct loss-surface curvature, and different optimizers behave differently under distinct curvatures. In this paper, we evaluate combinations of activation functions and optimizers, seeking the best pairing under a common experimental setting, and conclude with a general discussion of the efficiency of these combinations. To guarantee a fair comparison, hyperparameter tuning is performed for each combination.
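To make the experimental design concrete, below is a minimal sketch (not the paper's code) of how such a grid over optimizer-activation pairs can be run under an identical setting in PyTorch. The model architecture, synthetic data, learning rate, and epoch count are illustrative assumptions; the paper would additionally tune hyperparameters per pair.

```python
# Illustrative sketch: compare every optimizer-activation pair under the
# same setting. All sizes and hyperparameters here are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 20)   # synthetic inputs
y = torch.randn(512, 1)    # synthetic regression targets

activations = {"relu": nn.ReLU, "tanh": nn.Tanh, "sigmoid": nn.Sigmoid}
optimizers = {"sgd": torch.optim.SGD, "adam": torch.optim.Adam}

def train(act_cls, opt_cls, lr=1e-2, epochs=50):
    """Train a small MLP with the given activation and optimizer;
    return the final training loss."""
    model = nn.Sequential(nn.Linear(20, 64), act_cls(),
                          nn.Linear(64, 64), act_cls(),
                          nn.Linear(64, 1))
    opt = opt_cls(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return loss.item()

# Grid over every optimizer-activation pair; for a fair comparison each
# pair would get its own hyperparameter tuning rather than a shared lr.
for act_name, act_cls in activations.items():
    for opt_name, opt_cls in optimizers.items():
        final = train(act_cls, opt_cls)
        print(f"{opt_name:>5} + {act_name:<8} final loss = {final:.4f}")
```

The printed final losses give a crude ranking of the pairs; the paper's comparison is the same loop in spirit, with per-pair hyperparameter tuning substituted for the shared learning rate used here.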