Paper Title
An Investigation on Deep Learning with Beta Stabilizer
Paper Authors
Paper Abstract
Artificial neural networks (ANNs) have been used in many applications such as handwriting recognition and speech recognition. It is well known that the learning rate is a crucial value in the training procedure of artificial neural networks. It has been shown that the initial value of the learning rate can considerably affect the final result, and this value is almost always set manually in practice. A new parameter called the beta stabilizer was introduced to reduce the sensitivity to the initial learning rate, but this method had only been proposed for deep neural networks (DNNs) with the sigmoid activation function. In this paper we extend the beta stabilizer to long short-term memory (LSTM) networks and investigate the effects of beta stabilizer parameters on different models, including LSTM networks and DNNs with the ReLU activation function. We conclude that beta stabilizer parameters can reduce the sensitivity to the learning rate with almost the same performance on DNNs with the ReLU activation function and on LSTM networks. However, the results show that the effects of the beta stabilizer on DNNs with the ReLU activation function and on LSTM networks are weaker than its effects on DNNs with the sigmoid activation function.
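As a rough illustration of the idea summarized above, the sketch below shows one plausible form of a beta-stabilized layer in PyTorch: a trainable per-layer scalar beta that rescales the pre-activation before the nonlinearity, so the optimizer can compensate for a poorly chosen initial learning rate by adjusting beta jointly with the weights. This is a hedged reading of the abstract only; the class name BetaStabilizedLinear and the exact placement of beta are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class BetaStabilizedLinear(nn.Module):
    """Hypothetical beta-stabilized layer: a trainable per-layer scalar
    `beta` rescales the pre-activation, which is one way the stabilizer
    described in the abstract could reduce sensitivity to the initial
    learning rate. Illustrative assumption, not the authors' code."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.beta = nn.Parameter(torch.ones(1))  # learned jointly with the weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale the pre-activation by beta before the nonlinearity;
        # sigmoid mirrors the original DNN setting mentioned in the abstract.
        return torch.sigmoid(self.beta * self.linear(x))


# Usage sketch: the layer stacks like any other nn.Module.
if __name__ == "__main__":
    layer = BetaStabilizedLinear(28 * 28, 512)
    out = layer(torch.randn(8, 28 * 28))
    print(out.shape)  # torch.Size([8, 512])
```

Under this reading, replacing the sigmoid with ReLU, or inserting the same scalar into the gate pre-activations of an LSTM cell, would correspond to the extensions the paper investigates.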