Paper Title
ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation
Paper Authors
Paper Abstract
Transfer learning is a simple and powerful method that can be used to boost model performance of low-resource neural machine translation (NMT). Existing transfer learning methods for NMT are static: they transfer knowledge from a parent model to a child model only once, via parameter initialization. In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer knowledge from the parent model during the training of the child model. Specifically, for each training instance of the child model, ConsistTL constructs a semantically equivalent instance for the parent model and encourages prediction consistency between the parent and child on this instance, which is equivalent to the child model learning each instance under the guidance of the parent model. Experimental results on five low-resource NMT tasks demonstrate that ConsistTL yields significant improvements over strong transfer learning baselines, with a gain of up to 1.7 BLEU over the existing back-translation model on the widely-used WMT17 Turkish-English benchmark. Further analysis reveals that ConsistTL can improve the inference calibration of the child model. Code and scripts are freely available at https://github.com/NLP2CT/ConsistTL.
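The core idea described in the abstract is a consistency term that ties the child model's token-level predictions to a frozen parent model's predictions on a semantically equivalent instance. Below is a minimal PyTorch-style sketch of what such an objective could look like, assuming the consistency signal is a KL divergence between the two output distributions over a shared English target side; the function names, the `lam` weight, and the batch layout are illustrative assumptions, not the paper's released implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def consistency_loss(child_logits, parent_logits, temperature=1.0):
    """KL divergence from the parent's predictive distribution to the child's,
    computed per target token and averaged over the batch.

    child_logits, parent_logits: (batch, tgt_len, vocab) tensors produced by the
    child and parent models on instances that share the same target sentence.
    """
    child_log_probs = F.log_softmax(child_logits / temperature, dim=-1)
    parent_probs = F.softmax(parent_logits / temperature, dim=-1)
    # KL(parent || child): the child is pushed toward the parent's predictions.
    return F.kl_div(child_log_probs, parent_probs, reduction="batchmean")

def training_step(child_model, parent_model, child_batch, parent_batch, lam=1.0):
    """One hypothetical training step: standard cross-entropy on the child
    instance plus a consistency term guided by the (frozen) parent model."""
    child_logits = child_model(**child_batch)        # e.g. Turkish -> English
    with torch.no_grad():                            # parent provides guidance only
        parent_logits = parent_model(**parent_batch) # e.g. German -> English
    nll = F.cross_entropy(
        child_logits.view(-1, child_logits.size(-1)),
        child_batch["labels"].view(-1),
        ignore_index=-100,
    )
    return nll + lam * consistency_loss(child_logits, parent_logits)
```

In this reading, the parent instance is obtained by mapping the child's source sentence into the parent's source language (for example via translation), so both models predict the same target tokens and their output distributions are directly comparable; the weight `lam` balances supervised learning against consistency with the parent.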