Paper Title

Regularized Flexible Activation Function Combinations for Deep Neural Networks

Authors

Renlong Jie, Junbin Gao, Andrey Vasnev, Minh-Ngoc Tran

Abstract

Activation functions in deep neural networks are fundamental to achieving non-linear mappings. Traditional studies mainly focus on finding fixed activations for a particular set of learning tasks or model architectures. Research on flexible activations remains quite limited in both design philosophy and application scenarios. In this study, three principles for choosing flexible activation components are proposed, and a general combined form of flexible activation functions is implemented. Based on this, a novel family of flexible activation functions that can replace the sigmoid or tanh in LSTM cells is implemented, as well as a new family obtained by combining ReLU and ELUs. In addition, two new regularization terms based on assumptions taken as prior knowledge are introduced. It is shown that LSTM models with the proposed flexible activation P-Sig-Ramp provide significant improvements in time series forecasting, while the proposed P-E2-ReLU achieves better and more stable performance on lossy image compression tasks with convolutional auto-encoders. Furthermore, the proposed regularization terms improve the convergence, performance, and stability of models with flexible activation functions.
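The abstract does not spell out the exact combination form or the regularization terms. As a rough illustration only, below is a minimal PyTorch sketch assuming a P-E2-ReLU-style activation is a trainable weighted combination of ReLU(x), ELU(x), and -ELU(-x), with a simple quadratic penalty that pulls the weights toward a ReLU-like prior; the class name, the parameters a, b, c, and the penalty are hypothetical and not the paper's exact formulation.

```python
# Hypothetical sketch of a flexible activation in the spirit of the abstract's
# P-E2-ReLU family: a trainable combination of ReLU and ELU components plus a
# prior-based regularization term. All names and the exact form are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FlexibleE2ReLU(nn.Module):
    """Trainable weighted combination of ReLU(x), ELU(x), and -ELU(-x)."""

    def __init__(self):
        super().__init__()
        # Combination weights are learned per activation layer (assumed design).
        self.a = nn.Parameter(torch.tensor(1.0))  # weight on ReLU(x)
        self.b = nn.Parameter(torch.tensor(0.0))  # weight on ELU(x)
        self.c = nn.Parameter(torch.tensor(0.0))  # weight on -ELU(-x)

    def forward(self, x):
        return self.a * F.relu(x) + self.b * F.elu(x) - self.c * F.elu(-x)

    def prior_penalty(self):
        # Hypothetical regularization: keep the learned combination close to a
        # plain ReLU (a=1, b=c=0), treated here as prior knowledge.
        return (self.a - 1.0) ** 2 + self.b ** 2 + self.c ** 2


if __name__ == "__main__":
    act = FlexibleE2ReLU()
    x = torch.randn(4, 8)
    y = act(x)                                            # apply the activation
    loss = y.pow(2).mean() + 1e-3 * act.prior_penalty()   # add the prior penalty
    loss.backward()                                       # weights get gradients
    print(y.shape, act.a.grad, act.b.grad, act.c.grad)
```

In such a design, the penalty would be added to the task loss with a small coefficient, so the combination weights can drift from the ReLU-like prior only when the data supports it.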
