Why Neural Machine Translation Prefers Empty Outputs

Paper Title

Why Neural Machine Translation Prefers Empty Outputs

Authors

Xing Shi, Yijun Xiao, Kevin Knight

Abstract

We investigate why neural machine translation (NMT) systems assign high probability to empty translations. We find two explanations. First, label smoothing makes correct-length translations less confident, making it easier for the empty translation to finally outscore them. Second, NMT systems use the same, high-frequency EoS word to end all target sentences, regardless of length. This creates an implicit smoothing that increases zero-length translations. Using different EoS types in target sentences of different lengths exposes and eliminates this implicit smoothing.
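
The last sentence of the abstract describes using different EoS types for target sentences of different lengths. The snippet below is a minimal sketch of what that preprocessing step might look like, not the authors' implementation. The token format `<eos_N>`, the helper names `eos_for_length` and `tag_target`, and the `max_bucket` cap are all assumptions introduced here for illustration; the abstract does not specify how the EoS types are named or whether long lengths share a type.

```python
from typing import List


def eos_for_length(n_tokens: int, max_bucket: int = 50) -> str:
    """Return a length-specific end-of-sentence token (hypothetical naming)."""
    # Assumption: lengths above max_bucket share one token so the
    # vocabulary stays bounded. The abstract does not state this.
    bucket = min(n_tokens, max_bucket)
    return f"<eos_{bucket}>"


def tag_target(tokens: List[str], max_bucket: int = 50) -> List[str]:
    """Append the EoS token that matches the target sentence's length."""
    return tokens + [eos_for_length(len(tokens), max_bucket)]


# An empty target gets its own EoS type instead of sharing the one
# high-frequency EoS that ends every other sentence.
examples = [[], ["hello"], ["the", "cat", "sat"]]
tagged = [tag_target(t) for t in examples]
```

With a scheme like this, the probability mass that a model places on the zero-length EoS type is learned only from zero-length targets, which is one way to make the implicit smoothing mentioned in the abstract visible.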
