Paper Title

Entities as Experts: Sparse Memory Access with Entity Supervision

Authors

Thibault Févry, Livio Baldini Soares, Nicholas FitzGerald, Eunsol Choi, Tom Kwiatkowski

Abstract

We focus on the problem of capturing declarative knowledge about entities in the learned parameters of a language model. We introduce a new model - Entities as Experts (EAE) - that can access distinct memories of the entities mentioned in a piece of text. Unlike previous efforts to integrate entity knowledge into sequence models, EAE's entity representations are learned directly from text. We show that EAE's learned representations capture sufficient knowledge to answer TriviaQA questions such as "Which Dr. Who villain has been played by Roger Delgado, Anthony Ainley, Eric Roberts?", outperforming an encoder-generator Transformer model with 10x the parameters. According to the LAMA knowledge probes, EAE contains more factual knowledge than a similarly sized BERT, as well as previous approaches that integrate external sources of entity knowledge. Because EAE associates parameters with specific entities, it only needs to access a fraction of its parameters at inference time, and we show that the correct identification and representation of entities is essential to EAE's performance.
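The abstract's core mechanism is a memory of per-entity embeddings that is queried per mention, so inference touches only the rows for the entities involved rather than the whole table. The following is a minimal illustrative sketch of that sparse-lookup idea; the class and names (`EntityMemory`, `lookup`) are hypothetical and not the paper's actual implementation.

```python
# Illustrative sketch of sparse entity-memory access (not the paper's code).
# Each entity owns a distinct embedding; a mention query retrieves only the
# top-k closest rows, so most of the memory's parameters are never read.

class EntityMemory:
    """A table of learned entity embeddings, queried per mention."""

    def __init__(self, embeddings):
        # embeddings: dict mapping entity id -> embedding vector (list of floats)
        self.embeddings = embeddings

    def lookup(self, query, k=1):
        """Return the k entities whose embeddings best match the mention query.

        Only these k rows are accessed, which is the "fraction of its
        parameters at inference time" property the abstract describes.
        """
        def dot(u, v):
            return sum(a * b for a, b in zip(u, v))

        scored = sorted(self.embeddings.items(),
                        key=lambda kv: dot(query, kv[1]),
                        reverse=True)
        return scored[:k]


# Toy usage: two entities, one mention query vector.
mem = EntityMemory({
    "The_Master": [0.9, 0.1],
    "Roger_Delgado": [0.2, 0.8],
})
best_id, best_vec = mem.lookup([1.0, 0.0], k=1)[0]
print(best_id)  # The_Master
```

In the real model the query would be a contextual mention encoding and the memory would hold one trained vector per entity, but the access pattern, scoring a mention against entity rows and reading only the best matches, is the same.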
