Paper Title


Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation

Paper Authors

Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao

Abstract


Unsupervised neural machine translation (UNMT) has recently achieved remarkable results for several language pairs. However, it can only translate between a single language pair and cannot produce translation results for multiple language pairs at the same time. That is, research on multilingual UNMT has been limited. In this paper, we empirically introduce a simple method to translate between thirteen languages using a single encoder and a single decoder, making use of multilingual data to improve UNMT for all language pairs. On the basis of the empirical findings, we propose two knowledge distillation methods to further enhance multilingual UNMT performance. Our experiments on a dataset with English translated to and from twelve other languages (including three language families and six language branches) show remarkable results, surpassing strong unsupervised individual baselines while achieving promising performance between non-English language pairs in zero-shot translation scenarios and alleviating poor performance in low-resource language pairs.
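For readers unfamiliar with knowledge distillation in NMT, the sketch below illustrates the general idea only: a student model is trained on the reference translation while its token distributions are also pulled toward a teacher model's. This is a minimal, generic formulation, not the two specific distillation methods proposed in the paper; the function name, `alpha`, and `temperature` are illustrative assumptions.

```python
# Generic word-level knowledge distillation loss for NMT (illustrative only,
# not the paper's specific methods).
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, target_ids,
                      pad_id=0, alpha=0.5, temperature=1.0):
    """student_logits, teacher_logits: (batch, seq_len, vocab);
    target_ids: (batch, seq_len) reference token ids."""
    vocab = student_logits.size(-1)
    # Standard cross-entropy against the reference translation.
    ce = F.cross_entropy(student_logits.view(-1, vocab),
                         target_ids.view(-1),
                         ignore_index=pad_id)
    # KL divergence between teacher and student token distributions.
    t = temperature
    kl = F.kl_div(F.log_softmax(student_logits / t, dim=-1),
                  F.softmax(teacher_logits / t, dim=-1),
                  reduction="none").sum(-1)
    # Mask out padding positions before averaging.
    mask = target_ids.ne(pad_id).float()
    kl = (kl * mask).sum() / mask.sum().clamp(min=1.0)
    # Interpolate the supervised and distillation objectives.
    return (1 - alpha) * ce + alpha * (t ** 2) * kl


if __name__ == "__main__":
    # Random tensors stand in for student/teacher decoder outputs.
    B, L, V = 2, 5, 100
    student = torch.randn(B, L, V, requires_grad=True)
    teacher = torch.randn(B, L, V)
    targets = torch.randint(1, V, (B, L))
    loss = distillation_loss(student, teacher, targets)
    loss.backward()
    print(float(loss))
```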
