Paper Title
Language Models Explain Word Reading Times Better Than Empirical Predictability
Paper Authors
Paper Abstract
Though there is a strong consensus that word length and frequency are the most important single-word features determining visual-orthographic access to the mental lexicon, there is less agreement as to how best to capture syntactic and semantic factors. The traditional approach in cognitive reading research assumes that word predictability from sentence context is best captured by cloze completion probability (CCP) derived from human performance data. We review recent research suggesting that probabilistic language models provide deeper explanations for syntactic and semantic effects than CCP. Then we compare CCP with (1) symbolic n-gram models, which consolidate syntactic and semantic short-range relations by computing the probability of a word occurring given the two preceding words; (2) topic models, which rely on subsymbolic representations to capture long-range semantic similarity through word co-occurrence counts in documents; and (3) recurrent neural networks (RNNs), whose subsymbolic units are trained to predict the next word given all preceding words in the sentence. To examine lexical retrieval, these models were used to predict single fixation durations and gaze durations to capture rapidly successful and standard lexical access, and total viewing time to capture late semantic integration. The linear item-level analyses showed that all language models correlated more strongly with all eye-movement measures than CCP did. We then examined non-linear relations between the different types of predictability and reading times using generalized additive models. N-gram and RNN probabilities of the current word predicted reading performance more consistently than topic models or CCP.
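As an illustration of the kind of predictability measure the abstract describes, the following is a minimal Python sketch, not the authors' implementation, of how a word's probability given its two preceding words could be estimated with an add-alpha-smoothed trigram model and then related to per-word reading times in an item-level correlation. The toy corpus, the smoothing constant alpha, and the gaze durations are invented for illustration only.

```python
# Minimal sketch (not the paper's implementation): estimate the probability of a
# word given its two preceding words with an add-alpha-smoothed trigram model,
# then correlate the resulting log-probabilities with hypothetical per-word
# reading times, mirroring the item-level linear analysis described above.
from collections import Counter
import math

# Toy training corpus; the models in the study were trained on far larger text collections.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

trigrams, bigrams, vocab = Counter(), Counter(), set()
for sentence in corpus:
    tokens = ["<s>", "<s>"] + sentence.split()
    vocab.update(tokens)
    for i in range(2, len(tokens)):
        trigrams[(tokens[i - 2], tokens[i - 1], tokens[i])] += 1
        bigrams[(tokens[i - 2], tokens[i - 1])] += 1

def trigram_logprob(w1, w2, w3, alpha=1.0):
    """Add-alpha-smoothed log P(w3 | w1, w2)."""
    num = trigrams[(w1, w2, w3)] + alpha
    den = bigrams[(w1, w2)] + alpha * len(vocab)
    return math.log(num / den)

# Score each word of a test sentence given its two predecessors.
test = ["<s>", "<s>"] + "the dog chased the cat".split()
logprobs = [trigram_logprob(test[i - 2], test[i - 1], test[i])
            for i in range(2, len(test))]

# Hypothetical gaze durations (ms) for the five test words, for illustration only.
gaze_ms = [210.0, 245.0, 260.0, 205.0, 230.0]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

print("per-word log-probabilities:", [round(lp, 2) for lp in logprobs])
print("correlation with gaze durations:", round(pearson(logprobs, gaze_ms), 2))
```

In the study itself the language-model probabilities were additionally entered into generalized additive models to capture non-linear relations with reading times; the sketch above only shows the linear item-level correlation step.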