Paper Title

Self-Attention with Cross-Lingual Position Representation

Paper Authors

Liang Ding, Longyue Wang, Dacheng Tao

Paper Abstract

Position encoding (PE), an essential part of self-attention networks (SANs), is used to preserve the word order information for natural language processing tasks, generating fixed position indices for input sequences. However, in cross-lingual scenarios, e.g. machine translation, the PEs of source and target sentences are modeled independently. Due to word order divergences in different languages, modeling the cross-lingual positional relationships might help SANs tackle this problem. In this paper, we augment SANs with \emph{cross-lingual position representations} to model the bilingually aware latent structure for the input sentence. Specifically, we utilize bracketing transduction grammar (BTG)-based reordering information to encourage SANs to learn bilingual diagonal alignments. Experimental results on WMT'14 English$\Rightarrow$German, WAT'17 Japanese$\Rightarrow$English, and WMT'17 Chinese$\Leftrightarrow$English translation tasks demonstrate that our approach significantly and consistently improves translation quality over strong baselines. Extensive analyses confirm that the performance gains come from the cross-lingual information.
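
The paper itself provides no code here; the sketch below is only a rough illustration of the core idea, combining the standard sinusoidal position encoding over the original token order with a second encoding computed over BTG-reordered positions, so that each source token also carries a target-order-aware position signal. The function names, the additive fusion, and the `btg_order` input (assumed to come from an external BTG reordering step) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sinusoidal_pe(positions, d_model):
    """Standard sinusoidal position encoding (Vaswani et al., 2017).

    positions: 1-D sequence of (possibly reordered) position indices.
    Returns an array of shape (len(positions), d_model).
    """
    positions = np.asarray(positions, dtype=np.float64)[:, None]  # (T, 1)
    dims = np.arange(d_model, dtype=np.float64)[None, :]          # (1, d)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    pe = np.zeros_like(angles)
    pe[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions use cosine
    return pe

def cross_lingual_pe(seq_len, btg_order, d_model):
    """Hypothetical sketch of a cross-lingual position representation.

    btg_order[i] is the position token i would occupy after BTG-based
    reordering toward the target-language word order (assumed to be
    precomputed by an external BTG reordering model). The additive
    fusion is one plausible choice; the paper may combine the two
    position signals differently.
    """
    original = sinusoidal_pe(np.arange(seq_len), d_model)   # source-order PE
    reordered = sinusoidal_pe(btg_order, d_model)           # target-order-aware PE
    return original + reordered

# Toy usage: a 5-token source sentence whose BTG reordering swaps two phrases.
pe = cross_lingual_pe(seq_len=5, btg_order=[3, 4, 0, 1, 2], d_model=8)
print(pe.shape)  # (5, 8)
```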
