Paper Title
Batch Layer Normalization, A new normalization layer for CNNs and RNNs
Paper Authors
Paper Abstract
This study introduces a new normalization layer termed Batch Layer Normalization (BLN) to reduce the problem of internal covariate shift in deep neural network layers. As a combined version of batch and layer normalization, BLN adaptively puts appropriate weight on mini-batch and feature normalization, based on the inverse size of the mini-batch, to normalize the input to a layer during the learning process. At inference time it performs the exact computation with only a minor change, using either mini-batch statistics or population statistics. The decision to use either mini-batch or population statistics gives BLN the ability to play a comprehensive role in the hyper-parameter optimization process of models. A key advantage of BLN is that it supports theoretical analysis independent of the input data, while its statistical configuration depends heavily on the task performed, the amount of training data, and the size of batches. Test results indicate the application potential of BLN and its faster convergence compared with batch normalization and layer normalization in both convolutional and recurrent neural networks. The code of the experiments is publicly available online (https://github.com/A2Amir/Batch-Layer-Normalization).
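To make the idea in the abstract concrete, below is a minimal NumPy sketch of a forward pass that blends batch-wise and feature-wise (layer) normalization, with the blending weight taken as the inverse mini-batch size. The function name `batch_layer_norm`, the convex-combination rule, and the parameters `eps`, `gamma`, and `beta` are illustrative assumptions; the paper's exact formulation may differ, so refer to the linked repository for the authors' implementation.

```python
import numpy as np

def batch_layer_norm(x, eps=1e-5, gamma=None, beta=None):
    """Illustrative sketch for x of shape (batch, features).

    Assumption: batch-normalized and layer-normalized activations are
    combined, with the layer-norm weight set to the inverse mini-batch
    size (small batches rely more on feature statistics).
    """
    n = x.shape[0]        # mini-batch size
    w = 1.0 / n           # assumed weight on feature (layer) normalization

    # Batch-normalization statistics: mean/variance over the batch axis
    mu_b = x.mean(axis=0, keepdims=True)
    var_b = x.var(axis=0, keepdims=True)

    # Layer-normalization statistics: mean/variance over the feature axis
    mu_l = x.mean(axis=1, keepdims=True)
    var_l = x.var(axis=1, keepdims=True)

    # Convex combination of the two normalized signals
    x_b = (x - mu_b) / np.sqrt(var_b + eps)
    x_l = (x - mu_l) / np.sqrt(var_l + eps)
    x_hat = (1.0 - w) * x_b + w * x_l

    # Optional learnable scale and shift, as in batch/layer normalization
    if gamma is not None:
        x_hat = gamma * x_hat
    if beta is not None:
        x_hat = x_hat + beta
    return x_hat

# Example: a mini-batch of 4 samples with 8 features each
x = np.random.randn(4, 8).astype(np.float32)
y = batch_layer_norm(x)
print(y.mean(), y.std())
```

With a batch size of 1 the sketch reduces to pure layer normalization, and as the batch grows the batch statistics dominate, which mirrors the abstract's claim that the weighting adapts to the inverse mini-batch size.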