Paper Title

Continual Contrastive Finetuning Improves Low-Resource Relation Extraction

Paper Authors

Wenxuan Zhou, Sheng Zhang, Tristan Naumann, Muhao Chen, Hoifung Poon

Paper Abstract

Relation extraction (RE), which has relied on structurally annotated corpora for model training, has been particularly challenging in low-resource scenarios and domains. Recent literature has tackled low-resource RE by self-supervised learning, where the solution involves pretraining the entity pair embedding by RE-based objective and finetuning on labeled data by classification-based objective. However, a critical challenge to this approach is the gap in objectives, which prevents the RE model from fully utilizing the knowledge in pretrained representations. In this paper, we aim at bridging the gap and propose to pretrain and finetune the RE model using consistent objectives of contrastive learning. Since in this kind of representation learning paradigm, one relation may easily form multiple clusters in the representation space, we further propose a multi-center contrastive loss that allows one relation to form multiple clusters to better align with pretraining. Experiments on two document-level RE datasets, BioRED and Re-DocRED, demonstrate the effectiveness of our method. Particularly, when using 1% end-task training data, our method outperforms PLM-based RE classifier by 10.5% and 6.1% on the two datasets, respectively.
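To make the multi-center idea concrete, below is a minimal PyTorch sketch of what a multi-center contrastive loss could look like, assuming K learnable centers per relation and cosine similarity with a temperature. The function name, the soft aggregation over a relation's centers, and all tensor shapes are illustrative assumptions for exposition; the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def multi_center_contrastive_loss(embeddings, labels, centers, tau=0.1):
    """Hedged sketch of a multi-center contrastive loss (not the paper's exact loss).

    embeddings: (B, d) entity-pair representations from the encoder
    labels:     (B,)   gold relation labels in [0, C)
    centers:    (C, K, d) K learnable centers per relation class
    tau:        softmax temperature
    """
    emb = F.normalize(embeddings, dim=-1)
    ctr = F.normalize(centers, dim=-1)

    # Cosine similarity of every example to every center: (B, C, K)
    sims = torch.einsum('bd,ckd->bck', emb, ctr) / tau

    # Softly aggregate each relation's K centers, so the nearest cluster
    # dominates the class score while gradients still reach all centers.
    weights = F.softmax(sims, dim=-1)            # (B, C, K)
    class_scores = (weights * sims).sum(dim=-1)  # (B, C)

    # Cross-entropy over relations pulls an example toward (one of) its
    # own relation's centers and pushes it from other relations' centers.
    return F.cross_entropy(class_scores, labels)
```

In this sketch, `centers` would be an `nn.Parameter` of shape (C, K, d) trained jointly with the encoder; letting each relation keep K > 1 centers is what allows a single relation to occupy multiple clusters in the representation space, keeping finetuning consistent with the cluster structure induced by contrastive pretraining.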
