Paper Title

Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment

Paper Authors

Forrest Davis, Marten van Schijndel

Paper Abstract

A standard approach to evaluating language models analyzes how models assign probabilities to valid versus invalid syntactic constructions (i.e. is a grammatical sentence more probable than an ungrammatical sentence). Our work uses ambiguous relative clause attachment to extend such evaluations to cases of multiple simultaneous valid interpretations, where stark grammaticality differences are absent. We compare model performance in English and Spanish to show that non-linguistic biases in RNN LMs advantageously overlap with syntactic structure in English but not Spanish. Thus, English models may appear to acquire human-like syntactic preferences, while models trained on Spanish fail to acquire comparable human-like preferences. We conclude by relating these results to broader concerns about the relationship between comprehension (i.e. typical language model use cases) and production (which generates the training data for language models), suggesting that necessary linguistic biases are not present in the training signal at all.
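
To make the evaluation paradigm concrete, the sketch below compares the probabilities a language model assigns to two disambiguated versions of an ambiguous relative clause sentence, where subject-verb number agreement forces either high attachment (the clause modifies the first noun) or low attachment (it modifies the second). This is not the authors' code: it uses GPT-2 via Hugging Face transformers as a stand-in for the paper's RNN LMs, and the stimuli are hypothetical examples in the standard number-agreement paradigm.

```python
# Minimal sketch (assumed setup, not the paper's implementation): score two
# disambiguated continuations of an ambiguous relative clause and compare
# their total log-probabilities as a proxy for attachment preference.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability of a sentence under the model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # .loss is the mean negative log-likelihood over the predicted tokens
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.size(1) - 1)

# "who was" agrees with the singular head noun (daughter) -> high attachment;
# "who were" agrees with the plural local noun (colonels) -> low attachment.
high = "The daughter of the colonels who was injured left."
low = "The daughter of the colonels who were injured left."
print("high attachment:", sentence_logprob(high))
print("low attachment: ", sentence_logprob(low))
```

Under this paradigm, a higher log-probability for the high-attachment variant would mirror the English-like preference the paper reports; running the same comparison on Spanish models and stimuli is how the cross-linguistic contrast is assessed.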
