Paper title
It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information
Authors
Abstract
The performance of neural machine translation systems is commonly evaluated in terms of BLEU. However, due to its reliance on target language properties and generation, the BLEU metric does not allow an assessment of which translation directions are more difficult to model. In this paper, we propose cross-mutual information (XMI): an asymmetric information-theoretic metric of machine translation difficulty that exploits the probabilistic nature of most neural machine translation models. XMI allows us to better evaluate the difficulty of translating text into the target language while controlling for the difficulty of the target-side generation component independent of the translation task. We then present the first systematic and controlled study of cross-lingual translation difficulties using modern neural translation systems. Code for replicating our experiments is available online at https://github.com/e-bug/nmt-difficulty.
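The abstract does not spell out the formula, so the following is only a minimal sketch of how such a metric could be computed. It assumes XMI is the difference between the cross-entropy of a target-side language model and the conditional cross-entropy of the translation model on the same held-out target sentences. The function names and the toy log-probabilities are hypothetical and are not taken from the authors' repository.

```python
import math


def cross_entropy_bits(sentence_logprobs):
    """Average negative log2-probability per sentence.

    sentence_logprobs: natural-log probabilities that a model assigns
    to each held-out target sentence (one float per sentence).
    """
    if not sentence_logprobs:
        raise ValueError("need at least one sentence log-probability")
    total_nats = -sum(sentence_logprobs)
    # Convert nats to bits and average over sentences.
    return total_nats / (len(sentence_logprobs) * math.log(2))


def xmi(lm_logprobs, mt_logprobs):
    """Assumed definition: XMI(S -> T) = H_LM(T) - H_MT(T | S).

    lm_logprobs: log q_LM(t) for each target sentence t.
    mt_logprobs: log q_MT(t | s) for the same sentences given their sources.
    """
    if len(lm_logprobs) != len(mt_logprobs):
        raise ValueError("both models must score the same sentences")
    return cross_entropy_bits(lm_logprobs) - cross_entropy_bits(mt_logprobs)


# Toy, made-up scores for two held-out sentences (not real results).
toy_lm = [-40.0, -55.0]   # language model sees the target side only
toy_mt = [-12.0, -20.0]   # translation model also sees the source
toy_xmi = xmi(toy_lm, toy_mt)
```

Under this reading, a larger value means the source sentence removes more of the uncertainty about the target, which suggests an easier translation direction. Subtracting the language-model term is what controls for how hard the target language is to generate on its own.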