Paper Title

Large-scale Transfer Learning for Low-resource Spoken Language Understanding

Paper Authors

Xueli Jia, Jianzong Wang, Zhiyong Zhang, Ning Cheng, Jing Xiao

Paper Abstract

End-to-end Spoken Language Understanding (SLU) models are made increasingly large and complex to achieve state-of-the-art accuracy. However, the increased complexity of a model can also introduce a high risk of over-fitting, which is a major challenge in SLU tasks due to the limitation of available data. In this paper, we propose an attention-based SLU model together with three encoder enhancement strategies to overcome the data sparsity challenge. The first strategy focuses on a transfer-learning approach to improve the feature extraction capability of the encoder. It is implemented by pre-training the encoder component with a quantity of Automatic Speech Recognition (ASR) annotated data, relying on the standard Transformer architecture, and then fine-tuning the SLU model with a small amount of target labelled data. The second strategy adopts a multi-task learning scheme: the SLU model integrates the speech recognition model by sharing the same underlying encoder, thereby improving robustness and generalization ability. The third strategy, borrowing from the Component Fusion (CF) idea, involves a Bidirectional Encoder Representations from Transformers (BERT) model and aims to boost the capability of the decoder with an auxiliary network. It hence reduces the risk of over-fitting and indirectly augments the ability of the underlying encoder. Experiments on the FluentAI dataset show that the cross-language transfer learning and multi-task strategies improve accuracy by up to 4.52% and 3.89% respectively, compared to the baseline.
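To make the shared-encoder multi-task idea from the abstract concrete, below is a minimal PyTorch sketch in which one Transformer encoder feeds both an ASR head and an utterance-level intent head. The module choices, layer sizes, vocabulary size, and the 31-way intent output are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class SharedEncoderSLU(nn.Module):
    """Illustrative multi-task SLU model: a single shared Transformer encoder
    feeding an ASR head and an intent-classification head (hypothetical sizes,
    not the paper's configuration)."""

    def __init__(self, feat_dim=80, d_model=256, n_heads=4, n_layers=6,
                 vocab_size=1000, n_intents=31):
        super().__init__()
        self.input_proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        # Shared encoder: pre-trained on ASR data, then fine-tuned for SLU.
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.asr_head = nn.Linear(d_model, vocab_size)    # per-frame token logits
        self.intent_head = nn.Linear(d_model, n_intents)  # utterance-level intent

    def forward(self, feats):
        # feats: (batch, time, feat_dim) acoustic features, e.g. filterbanks.
        h = self.encoder(self.input_proj(feats))
        asr_logits = self.asr_head(h)                     # (batch, time, vocab)
        intent_logits = self.intent_head(h.mean(dim=1))   # (batch, n_intents)
        return asr_logits, intent_logits


if __name__ == "__main__":
    model = SharedEncoderSLU()
    feats = torch.randn(2, 120, 80)                       # dummy 2-utterance batch
    asr_logits, intent_logits = model(feats)
    # Joint objective: intent loss plus an ASR loss (e.g. CTC) on asr_logits.
    intent_loss = nn.CrossEntropyLoss()(intent_logits, torch.tensor([3, 7]))
    print(asr_logits.shape, intent_logits.shape, intent_loss.item())
```

For the first (transfer-learning) strategy, the same `encoder` would first be trained inside an attention-based ASR model on a large annotated corpus, and its weights copied here before fine-tuning on the small target SLU set.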
