{"title":"Two recent advances on normalization methods for deep neural network optimization","authors":"Lei Zhang","doi":"10.1109/VCIP49819.2020.9301751","journal":{"name":"2020 IEEE International Conference on Visual Communications and Image Processing (VCIP)","volume":"3 1","pages":"0"},"publicationDate":"2020-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"0","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/VCIP49819.2020.9301751"}
Two recent advances on normalization methods for deep neural network optimization
Normalization methods are important for the effective and efficient optimization of deep neural networks (DNNs). Statistics such as the mean and variance can be used to normalize network activations or weights, which makes training more stable. Among activation normalization techniques, batch normalization (BN) is the most popular. However, BN performs poorly when the training batch size is small. We found that the formulation of BN in the inference stage is problematic and consequently presented a corrected one. Without any change to the training stage, the corrected BN significantly improves inference performance when the model is trained with a small batch size.
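For readers unfamiliar with the two stages the abstract contrasts, the following is a minimal sketch of standard BN on scalar activations, not the corrected formulation, which the abstract does not specify. The function names, the `momentum` and `eps` defaults, the use of the biased variance, and the synthetic Gaussian data are all assumptions made for illustration. Training normalizes with the current batch's statistics and updates running averages, while inference normalizes with those running averages. The helper `batch_mean_spread` illustrates one commonly cited difficulty with small batches: the batch mean fluctuates much more from batch to batch.

```python
import random
import statistics


def bn_train_step(batch, running_mean, running_var, momentum=0.1, eps=1e-5):
    """Training mode: normalize with the batch's own statistics and
    update the running averages (momentum and eps are assumed defaults)."""
    mean = statistics.fmean(batch)
    var = statistics.pvariance(batch, mu=mean)  # biased (population) variance
    normalized = [(x - mean) / (var + eps) ** 0.5 for x in batch]
    running_mean = (1 - momentum) * running_mean + momentum * mean
    running_var = (1 - momentum) * running_var + momentum * var
    return normalized, running_mean, running_var


def bn_inference(x, running_mean, running_var, eps=1e-5):
    """Inference mode: normalize with the accumulated running statistics."""
    return (x - running_mean) / (running_var + eps) ** 0.5


def batch_mean_spread(batch_size, num_batches=2000, seed=0):
    """Standard deviation of the batch mean across many synthetic batches."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(3.0, 2.0) for _ in range(batch_size))
        for _ in range(num_batches)
    ]
    return statistics.pstdev(means)


if __name__ == "__main__":
    # Compare how much the batch mean varies for a small and a larger batch.
    for size in (2, 64):
        print(size, batch_mean_spread(size))
```

Because the training path depends on per-batch statistics and the inference path on running averages, the two paths can disagree when the batch statistics are noisy. How the paper corrects the inference formulation is not described in the abstract, so it is not modeled here.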