Using Domain Knowledge for Low Resource Named Entity Recognition

Paper Title

Using Domain Knowledge for Low Resource Named Entity Recognition

Authors

Shi, Yuan

Abstract

Named entity recognition has long been an active research topic in natural language processing. Traditional deep learning methods require a large amount of labeled data for model training, which makes them unsuitable for domains where labeled resources are scarce. In addition, existing cross-domain knowledge transfer methods need to adjust the entity labels for different domains, which increases the training cost. To solve these problems, and inspired by a processing method for Chinese named entity recognition, we propose using domain knowledge to improve the performance of named entity recognition in low-resource domains. The domain knowledge we mainly apply consists of a domain dictionary and domain labeled data. We use dictionary information for each word to strengthen its word embedding, and domain labeled data to reinforce the recognition effect. The proposed model avoids large-scale data adjustments across domains while handling low-resource named entity recognition. Experiments demonstrate the effectiveness of our method: it achieves impressive results on a dataset in the field of scientific and technological equipment, and its F1 score is significantly higher than that of many baseline methods.
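The abstract says dictionary information is used to strengthen each word's embedding but does not say how the two are combined. The following is a minimal sketch of one common approach, assuming the dictionary lookup is turned into a per-type indicator vector and concatenated to the embedding. The function names, the toy dictionary, the entity types, and the embedding values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def dictionary_features(
    tokens: List[str],
    domain_dict: Dict[str, Set[str]],
    entity_types: List[str],
) -> List[List[float]]:
    """Build one indicator vector per token.

    Position i is 1.0 if the (hypothetical) domain dictionary lists the
    token under entity_types[i], and 0.0 otherwise.
    """
    features = []
    for token in tokens:
        matched = domain_dict.get(token.lower(), set())
        features.append([1.0 if t in matched else 0.0 for t in entity_types])
    return features


def augment_embeddings(
    embeddings: List[List[float]],
    features: List[List[float]],
) -> List[List[float]]:
    """Concatenate each embedding with its dictionary indicator vector.

    Concatenation is an assumption; the paper may combine them differently.
    """
    return [emb + feat for emb, feat in zip(embeddings, features)]


# Toy inputs, invented for illustration only.
tokens = ["The", "spectrometer", "uses", "laser"]
entity_types = ["EQUIPMENT", "COMPONENT"]
domain_dict = {
    "spectrometer": {"EQUIPMENT"},
    "laser": {"EQUIPMENT", "COMPONENT"},
}
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]

features = dictionary_features(tokens, domain_dict, entity_types)
augmented = augment_embeddings(embeddings, features)
```

In this sketch the augmented vectors would be fed to a sequence tagger in place of the plain embeddings, so a dictionary match is available as an explicit signal even when little labeled data exists for the domain.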
