Semantic Label Smoothing for Sequence to Sequence Problems

Paper title

Semantic Label Smoothing for Sequence to Sequence Problems

Authors

Michal Lukasik, Himanshu Jain, Aditya Krishna Menon, Seungyeon Kim, Srinadh Bhojanapalli, Felix Yu, Sanjiv Kumar

Abstract

Label smoothing has been shown to be an effective regularization strategy in classification: it prevents overfitting and helps with label de-noising. However, extending such methods directly to seq2seq settings, such as machine translation, is challenging: the large target output space of such problems makes it intractable to apply label smoothing over all possible outputs. Most existing approaches for seq2seq settings either do token-level smoothing or smooth over sequences generated by randomly substituting tokens in the target sequence. Unlike these works, in this paper we propose a technique that smooths over well-formed relevant sequences that not only have sufficient n-gram overlap with the target sequence but are also semantically similar. Our method shows a consistent and significant improvement over state-of-the-art techniques on different datasets.
