Paper Title


Cross-Linguistic Syntactic Evaluation of Word Prediction Models

Paper Authors

Mueller, Aaron, Nicolai, Garrett, Petrou-Zeniou, Panayiota, Talmina, Natalia, Linzen, Tal

Abstract


A range of studies have concluded that neural word prediction models can distinguish grammatical from ungrammatical sentences with high accuracy. However, these studies are based primarily on monolingual evidence from English. To investigate how these models' ability to learn syntax varies by language, we introduce CLAMS (Cross-Linguistic Assessment of Models on Syntax), a syntactic evaluation suite for monolingual and multilingual models. CLAMS includes subject-verb agreement challenge sets for English, French, German, Hebrew and Russian, generated from grammars we develop. We use CLAMS to evaluate LSTM language models as well as monolingual and multilingual BERT. Across languages, monolingual LSTMs achieved high accuracy on dependencies without attractors, and generally poor accuracy on agreement across object relative clauses. On other constructions, agreement accuracy was generally higher in languages with richer morphology. Multilingual models generally underperformed monolingual models. Multilingual BERT showed high syntactic accuracy on English, but noticeable deficiencies in other languages.
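Challenge sets of this kind are typically scored on minimal pairs: the model assigns a score (e.g. a log-probability) to a grammatical sentence and to a counterpart that differs only in verb agreement, and an item counts as correct when the grammatical form scores higher. A minimal sketch of that accuracy computation, assuming a generic `score` callable standing in for an LSTM or BERT scorer (the function name and toy scores below are illustrative, not from CLAMS itself):

```python
# Minimal-pair evaluation sketch: an item is "correct" when the model
# scores the grammatical sentence above its ungrammatical counterpart.

def agreement_accuracy(pairs, score):
    """pairs: iterable of (grammatical, ungrammatical) sentence pairs.
    score: callable mapping a sentence to a model score (higher = better)."""
    pairs = list(pairs)
    correct = sum(score(good) > score(bad) for good, bad in pairs)
    return correct / len(pairs)

# Toy usage with hand-assigned log-probability-like scores; a real
# evaluation would query a trained language model instead.
toy_scores = {
    "the author laughs": -1.0,
    "the author laugh": -5.0,   # agreement violation
    "the authors laugh": -2.0,
    "the authors laughs": -1.5, # agreement violation scored too high
}
pairs = [
    ("the author laughs", "the author laugh"),
    ("the authors laugh", "the authors laughs"),
]
print(agreement_accuracy(pairs, toy_scores.get))  # 0.5
```

With real model scores, this per-pair comparison is what yields the per-construction accuracies the abstract reports (e.g. high accuracy without attractors, low accuracy across object relative clauses).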
