Paper Title
Dependency-aware Self-training for Entity Alignment
Paper Authors
Paper Abstract
Entity Alignment (EA), which aims to detect entity mappings (i.e., equivalent entity pairs) across different Knowledge Graphs (KGs), is critical for KG fusion. Neural EA methods dominate current EA research but still suffer from their reliance on labelled mappings. To address this problem, a few works have explored boosting the training of EA models with self-training, which iteratively adds confidently predicted mappings to the training data. Although the effectiveness of self-training can be glimpsed in some specific settings, we still have very limited understanding of it. One reason is that existing works concentrate on devising EA models and treat self-training merely as an auxiliary tool. To fill this knowledge gap, we shift the perspective to self-training itself to shed light on it. In addition, existing self-training strategies have limited impact because they introduce either a large amount of False Positive noise or only a small number of True Positive pseudo mappings. To improve self-training for EA, we propose exploiting the dependencies between entities, a particularity of EA, to suppress the noise without hurting the recall of True Positive mappings. Through extensive experiments, we show that the introduction of dependency lifts self-training for EA to a new level. The value of self-training in alleviating the reliance on annotation is in fact much higher than what has been realised. Furthermore, we suggest future studies on smart data annotation to break the ceiling of EA performance.
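To make the self-training loop described in the abstract concrete, below is a minimal sketch of confidence-based pseudo-mapping selection with a dependency-aware filter. The mutual-nearest-neighbour rule used here is only one plausible way to encode the one-to-one dependency between entities; the function names, the threshold value, and the filtering rule are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def select_pseudo_mappings(sim: np.ndarray, threshold: float = 0.9):
    """Select confident, dependency-consistent pseudo mappings.

    sim -- (n_src, n_tgt) entity-similarity matrix produced by some EA model.
    A pair (i, j) is kept only if i and j are mutual nearest neighbours
    (a one-to-one dependency constraint, so two source entities cannot
    claim the same target) and their similarity clears the threshold.
    """
    best_tgt = sim.argmax(axis=1)  # most similar target for each source entity
    best_src = sim.argmax(axis=0)  # most similar source for each target entity
    return [
        (i, int(j))
        for i, j in enumerate(best_tgt)
        if best_src[j] == i and sim[i, j] >= threshold
    ]

# Toy usage: 4 source vs. 4 target entities with random similarities.
rng = np.random.default_rng(0)
sim = rng.random((4, 4))
print(select_pseudo_mappings(sim, threshold=0.5))

# A full self-training pipeline would wrap this in an iterative loop
# (pseudo-code, model interface assumed):
#   train = set(seed_mappings)
#   for _ in range(num_rounds):
#       model.fit(train)                                   # retrain the EA model
#       train |= set(select_pseudo_mappings(model.similarity()))
```

Compared with a plain confidence threshold, the mutual-nearest-neighbour filter trades a little recall for a large reduction in False Positive noise, which is the trade-off the abstract argues dependency information can improve.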