Paper title
Why Neural Machine Translation Prefers Empty Outputs
Paper authors
Paper abstract
We investigate why neural machine translation (NMT) systems assign high probability to empty translations. We find two explanations. First, label smoothing makes correct-length translations less confident, making it easier for the empty translation to finally outscore them. Second, NMT systems use the same, high-frequency EoS word to end all target sentences, regardless of length. This creates an implicit smoothing that increases zero-length translations. Using different EoS types in target sentences of different lengths exposes and eliminates this implicit smoothing.
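The two effects named in the abstract can be illustrated with a small sketch. It is a toy calculation under stated assumptions, not the paper's experimental setup. The first part assumes a model that exactly matches a label-smoothed target with smoothing weight `epsilon` over a vocabulary of `vocab_size` types, so each correct token receives probability `1 - epsilon + epsilon / vocab_size`; the score of a full translation then shrinks with its length, while the empty translation pays for only one EoS decision. The second part shows one possible way to use different EoS types for different target lengths. The function names, the `</s_k>` symbol format, and the bucketing by `bucket_size` are hypothetical choices made for illustration.

```python
import math


def smoothed_ceiling(epsilon: float, vocab_size: int) -> float:
    """Probability of the correct token under a label-smoothed target.

    Assumes the usual uniform smoothing: (1 - epsilon) on the reference
    token plus epsilon spread evenly over all vocab_size types.
    """
    return 1.0 - epsilon + epsilon / vocab_size


def smoothed_logprob(num_tokens: int, epsilon: float, vocab_size: int) -> float:
    """Log-probability of a translation with num_tokens words plus one EoS,
    assuming every step sits exactly at the smoothed target probability."""
    return (num_tokens + 1) * math.log(smoothed_ceiling(epsilon, vocab_size))


def empty_outscores(p_eos_at_start: float, num_tokens: int,
                    epsilon: float, vocab_size: int) -> bool:
    """True if the empty translation (EoS emitted at the first step) has a
    higher log-probability than the idealized num_tokens-word translation."""
    return math.log(p_eos_at_start) > smoothed_logprob(num_tokens, epsilon, vocab_size)


def eos_for_length(num_tokens: int, bucket_size: int = 10) -> str:
    """Hypothetical length-specific EoS symbol: one type per length bucket."""
    return f"</s_{num_tokens // bucket_size}>"


def terminate(tokens, bucket_size: int = 10):
    """Append the length-specific EoS symbol to a target sentence."""
    tokens = list(tokens)
    return tokens + [eos_for_length(len(tokens), bucket_size)]


if __name__ == "__main__":
    # Longer references accumulate more of the smoothing penalty.
    for n in (5, 30, 60):
        print(n, smoothed_logprob(n, epsilon=0.1, vocab_size=32000))
    # A zero-length target gets the bucket-0 EoS type in this scheme.
    print(terminate([]))
    print(terminate(["ein", "kurzer", "Satz"]))
```

With a single shared EoS type, the probability of emitting EoS at the first step is informed by every sentence ending in the training data. With length-specific types, the symbol that ends a zero-length sentence is only as frequent as such sentences actually are, which is the intuition behind the abstract's claim about implicit smoothing.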