Paper Title
A Systematic Assessment of Syntactic Generalization in Neural Language Models
Paper Authors
Paper Abstract
While state-of-the-art neural network models continue to achieve lower perplexity scores on language modeling benchmarks, it remains unknown whether optimizing for broad-coverage predictive performance leads to human-like syntactic knowledge. Furthermore, existing work has not provided a clear picture about the model properties required to produce proper syntactic generalizations. We present a systematic evaluation of the syntactic knowledge of neural language models, testing 20 combinations of model types and data sizes on a set of 34 English-language syntactic test suites. We find substantial differences in syntactic generalization performance by model architecture, with sequential models underperforming other architectures. Factorially manipulating model architecture and training dataset size (1M--40M words), we find that variability in syntactic generalization performance is substantially greater by architecture than by dataset size for the corpora tested in our experiments. Our results also reveal a dissociation between perplexity and syntactic generalization performance.
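The abstract describes evaluating language models against syntactic test suites. As a rough illustration only (not the authors' code, models, or test items), the sketch below shows the surprisal-comparison criterion such test suites typically rely on: a model passes a minimal-pair item when it assigns higher surprisal to the ungrammatical variant than to the grammatical one. The off-the-shelf GPT-2 checkpoint and the example sentence pair are assumptions for illustration; the paper's actual criteria are defined over critical regions of each sentence and over models the authors train themselves.

```python
# Minimal sketch (illustrative, not the authors' evaluation code) of a
# surprisal-based test-suite criterion: the model should assign higher
# total surprisal to an ungrammatical sentence than to its grammatical
# minimal-pair counterpart.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Assumption: an off-the-shelf GPT-2 stands in for the trained models
# evaluated in the paper.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def total_surprisal(sentence: str) -> float:
    """Sum of per-token surprisals (negative log-probabilities, in nats)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Surprisal of token t is -log p(token_t | tokens_<t); the first token
    # has no conditioning context, so it is skipped.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_log_probs = log_probs.gather(1, ids[0, 1:].unsqueeze(1)).squeeze(1)
    return float(-token_log_probs.sum())

# Hypothetical subject-verb agreement item (not taken from the paper's suites).
grammatical = "The keys to the cabinet are on the table."
ungrammatical = "The keys to the cabinet is on the table."

passes = total_surprisal(ungrammatical) > total_surprisal(grammatical)
print("Item passed:", passes)
```

A full evaluation would aggregate such pass/fail judgments over all items in each of the 34 test suites and report accuracy per architecture and training-set size, which is the kind of comparison the abstract summarizes.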