Batch Layer Normalization: A New Normalization Layer for CNNs and RNNs

A. Ziaee, Erion Çano
DOI: 10.1145/3571560.3571566
Published in: Proceedings of the 6th International Conference on Advances in Artificial Intelligence
Publication date: 2022-09-19
Citations: 3

Abstract

This study introduces a new normalization layer, termed Batch Layer Normalization (BLN), to reduce the problem of internal covariate shift in deep neural network layers. As a combined version of batch and layer normalization, BLN adaptively weights mini-batch and feature normalization based on the inverse size of the mini-batch to normalize the input to a layer during training. At inference time it performs the same computation with a minor change, using either mini-batch statistics or population statistics. The choice between mini-batch and population statistics lets BLN play a comprehensive role in a model's hyper-parameter optimization process. A key advantage of BLN is that its theoretical analysis does not depend on the input data, while its statistical configuration depends heavily on the task, the amount of training data, and the batch size. Test results indicate the application potential of BLN and its faster convergence compared to batch normalization and layer normalization in both Convolutional and Recurrent Neural Networks. The code of the experiments is publicly available online.
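The abstract describes BLN as a weighted combination of batch normalization (statistics per feature, across the mini-batch) and layer normalization (statistics per sample, across features), with the mixing weight tied to the inverse mini-batch size. The exact formulation is in the paper, not this page, so the sketch below is a hypothetical reading of that description: the function name `batch_layer_norm`, the shape convention, and the precise way the inverse batch size enters the weighting are all assumptions, not the authors' published code.

```python
import numpy as np

def batch_layer_norm(x, eps=1e-5):
    """Hypothetical sketch of Batch Layer Normalization (BLN).

    x: 2-D array of shape (batch_size, num_features).
    The mixing weight w = 1/batch_size is an assumed reading of
    "the inverse size of the mini-batch": small batches lean
    toward layer (per-sample) statistics, large batches toward
    batch (per-feature) statistics.
    """
    batch_size = x.shape[0]
    w = 1.0 / batch_size  # inverse mini-batch size (assumed weighting)

    # Batch-normalization statistics: per feature, across the batch.
    mu_b = x.mean(axis=0, keepdims=True)
    var_b = x.var(axis=0, keepdims=True)

    # Layer-normalization statistics: per sample, across features.
    mu_l = x.mean(axis=1, keepdims=True)
    var_l = x.var(axis=1, keepdims=True)

    x_bn = (x - mu_b) / np.sqrt(var_b + eps)
    x_ln = (x - mu_l) / np.sqrt(var_l + eps)

    # Weighted combination: more layer-norm weight as batches shrink.
    return (1.0 - w) * x_bn + w * x_ln
```

At inference, the abstract notes that either mini-batch statistics or stored population statistics can substitute for `mu_b`/`var_b`; that switch is omitted here for brevity.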