Paper Title
Word Order Matters when you Increase Masking
Paper Authors
Paper Abstract
Word order, an essential property of natural languages, is injected into Transformer-based neural language models using position encoding. However, recent experiments have shown that explicit position encoding is not always useful, since some models without such a feature managed to achieve state-of-the-art performance on some tasks. To better understand this phenomenon, we examine the effect of removing position encodings on the pre-training objective itself (i.e., masked language modelling), to test whether models can reconstruct position information from co-occurrences alone. We do so by controlling the number of masked tokens in the input sentence, as a proxy for the importance of position information in the task. We find that the necessity of position information increases with the amount of masking, and that masked language models without position encodings are not able to reconstruct this information on the task. These findings point towards a direct relationship between the amount of masking and the ability of Transformers to capture order-sensitive aspects of language using position encoding.
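To make the masking control described in the abstract concrete, below is a minimal, hypothetical sketch (not the authors' code) of BERT-style token masking with an adjustable masking probability; the mask_tokens helper and MASK_TOKEN symbol are assumptions introduced for illustration. Raising mask_prob hides more lexical context, which is the proxy the abstract uses for increasing the importance of position information.

import random

MASK_TOKEN = "[MASK]"  # illustrative placeholder; real tokenizers define their own mask symbol

def mask_tokens(tokens, mask_prob, rng=random):
    # Replace each token with MASK_TOKEN independently with probability mask_prob.
    # The more tokens are hidden, the more the model must rely on position
    # information (rather than co-occurrence with visible words) to reconstruct them.
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            targets.append(tok)    # the model should predict the original token here
        else:
            masked.append(tok)
            targets.append(None)   # no prediction at unmasked positions
    return masked, targets

# The same sentence under light vs. heavy masking:
sentence = "the cat sat on the mat".split()
print(mask_tokens(sentence, mask_prob=0.15))  # few tokens hidden
print(mask_tokens(sentence, mask_prob=0.75))  # most tokens hidden

In this sketch, the experiment the abstract describes would amount to training an encoder with and without position embeddings on inputs produced at different mask_prob values and comparing reconstruction performance.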