Self-Attention Gazetteer Embeddings for Named-Entity Recognition

Paper Title

Self-Attention Gazetteer Embeddings for Named-Entity Recognition

Authors

Stanislav Peshterliev, Christophe Dupuy, Imre Kiss

Abstract

Recent attempts to ingest external knowledge into neural models for named-entity recognition (NER) have exhibited mixed results. In this work, we present GazSelfAttn, a novel gazetteer embedding approach that uses self-attention and match span encoding to build enhanced gazetteer embeddings. In addition, we demonstrate how to build gazetteer resources from the open-source Wikidata knowledge base. Evaluations on the CoNLL-03 and OntoNotes 5 datasets show F1 improvements over the baseline model from 92.34 to 92.86 and from 89.11 to 89.32, respectively, achieving performance comparable to large state-of-the-art models.
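
To make the idea of match span encoding more concrete, here is a minimal sketch of how gazetteer matches might be turned into per-token tags. It is an illustration under stated assumptions, not the authors' implementation: the toy `GAZETTEER` dictionary, the `MAX_SPAN` limit, the `match_spans` helper, and the B/I/L/U position labels (begin, inside, last, unit) are all hypothetical choices made for this example. The abstract does not specify these details.

```python
from typing import Dict, List, Tuple

# Hypothetical toy gazetteer: lowercased token sequence -> entity type.
# A real resource would be built from a knowledge base such as Wikidata.
GAZETTEER: Dict[Tuple[str, ...], str] = {
    ("new", "york"): "LOC",
    ("new", "york", "times"): "ORG",
    ("york",): "LOC",
}
MAX_SPAN = 3  # assumed upper bound on match length, in tokens


def match_spans(tokens: List[str]) -> List[List[Tuple[str, str]]]:
    """For each token, collect (entity_type, position) tags from every
    gazetteer match that covers it. Position is one of B, I, L, U."""
    tags: List[List[Tuple[str, str]]] = [[] for _ in tokens]
    lowered = [t.lower() for t in tokens]
    for start in range(len(tokens)):
        for length in range(1, MAX_SPAN + 1):
            end = start + length
            if end > len(tokens):
                break
            etype = GAZETTEER.get(tuple(lowered[start:end]))
            if etype is None:
                continue
            if length == 1:
                tags[start].append((etype, "U"))  # single-token match
            else:
                tags[start].append((etype, "B"))  # first token of the match
                for i in range(start + 1, end - 1):
                    tags[i].append((etype, "I"))  # interior tokens
                tags[end - 1].append((etype, "L"))  # last token of the match
    return tags


if __name__ == "__main__":
    sentence = ["The", "New", "York", "Times"]
    for token, token_tags in zip(sentence, match_spans(sentence)):
        print(token, token_tags)
```

A token can be covered by several overlapping matches, as "York" is here. In a model along the lines the abstract describes, each tag would be mapped to an embedding and a self-attention layer would combine the candidate matches per token; that step is omitted from this sketch.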
