Paper Title

Attentive batch normalization for LSTM-based acoustic modeling of speech recognition

Paper Authors

Fenglin Ding, Wu Guo, Lirong Dai, Jun Du

Abstract

Batch normalization (BN) is an effective method to accelerate model training and improve the generalization performance of neural networks. In this paper, we propose an improved batch normalization technique called attentive batch normalization (ABN) in Long Short Term Memory (LSTM) based acoustic modeling for automatic speech recognition (ASR). In the proposed method, an auxiliary network is used to dynamically generate the scaling and shifting parameters in batch normalization, and attention mechanisms are introduced to improve their regularized performance. Furthermore, two schemes, frame-level and utterance-level ABN, are investigated. We evaluate our proposed methods on Mandarin and Uyghur ASR tasks, respectively. The experimental results show that the proposed ABN greatly improves the performance of batch normalization in terms of transcription accuracy for both languages.
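The abstract describes ABN only at a high level: an auxiliary network with an attention mechanism dynamically generates the scaling and shifting parameters that standard BN keeps as fixed learned vectors. A minimal NumPy sketch of one plausible utterance-level reading is below; the pool of `K` candidate scale/shift pairs, the attention projection `W_att`, and the mean-pooled utterance summary are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def attentive_batch_norm(x, gammas, betas, W_att, eps=1e-5):
    """Sketch of utterance-level attentive batch normalization (ABN).

    x      : (T, D) acoustic frames of one utterance
    gammas : (K, D) K candidate scaling vectors (learnable)
    betas  : (K, D) K candidate shifting vectors (learnable)
    W_att  : (D, K) attention projection of the auxiliary network
    """
    # Standard BN-style standardization over the time axis.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # Auxiliary network: summarize the utterance, then attend over
    # the K candidate parameter sets with a softmax.
    summary = x.mean(axis=0)            # (D,) utterance summary
    scores = summary @ W_att            # (K,) attention logits
    att = np.exp(scores - scores.max())
    att = att / att.sum()               # (K,) softmax weights

    # Dynamically generated scale/shift: attention-weighted mixtures.
    gamma = att @ gammas                # (D,)
    beta = att @ betas                  # (D,)
    return gamma * x_hat + beta
```

A frame-level variant would compute attention weights (and hence gamma and beta) per frame rather than once per utterance; in a real LSTM acoustic model these operations would sit inside the recurrent stack and be trained jointly with it.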
