Paper title
Discovering alignment relations with Graph Convolutional Networks: a biomedical case study
Paper authors
Paper abstract
Knowledge graphs are freely aggregated, published, and edited in the Web of data, and thus may overlap. Hence, a key task resides in aligning (or matching) their content. This task encompasses the identification, within an aggregated knowledge graph, of nodes that are equivalent, more specific, or weakly related. In this article, we propose to match nodes within a knowledge graph by (i) learning node embeddings with Graph Convolutional Networks (GCNs) such that similar nodes have low distances in the embedding space, and (ii) clustering nodes based on their embeddings, in order to suggest alignment relations between nodes of the same cluster. We conducted experiments with this approach on the real-world application of aligning knowledge in the field of pharmacogenomics, which motivated our study. We particularly investigated the interplay between domain knowledge and GCN models, with the following two focuses. First, we applied inference rules associated with domain knowledge, independently or combined, before learning node embeddings, and we measured the improvements in matching results. Second, while our GCN model is agnostic to the exact alignment relations (e.g., equivalence, weak similarity), we observed that distances in the embedding space are coherent with the "strength" of these different relations (e.g., smaller distances for equivalences). This lets us consider clustering and distances in the embedding space as a means to suggest alignment relations in our case study.
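Steps (i) and (ii) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy undirected graph of six nodes, one-hot node features, and identity weights standing in for trained GCN parameters. The clustering step is a simple distance-threshold (single-linkage) grouping chosen for illustration, and all function names and the threshold value are hypothetical.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj, features, weights):
    """One forward pass ReLU(A_norm X W); the weights are NOT trained here."""
    return np.maximum(normalized_adjacency(adj) @ features @ weights, 0.0)


def pairwise_distances(emb):
    """Euclidean distance between every pair of node embeddings."""
    diff = emb[:, None, :] - emb[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def threshold_clusters(dist, threshold):
    """Single-linkage grouping: nodes closer than `threshold` share a cluster."""
    n = dist.shape[0]
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]       # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i, j] < threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


def candidate_alignments(labels, dist):
    """Pairs of nodes in the same cluster, sorted by increasing distance."""
    n = len(labels)
    pairs = [(i, j, float(dist[i, j]))
             for i in range(n) for j in range(i + 1, n)
             if labels[i] == labels[j]]
    return sorted(pairs, key=lambda p: p[2])


# Toy graph (assumption): a triangle 0-1-2 and a path 3-4-5.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (4, 5)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

features = np.eye(6)   # one-hot features (assumption)
weights = np.eye(6)    # placeholder for trained parameters (assumption)

embeddings = gcn_layer(adj, features, weights)
dist = pairwise_distances(embeddings)
labels = threshold_clusters(dist, threshold=0.5)   # hypothetical threshold
pairs = candidate_alignments(labels, dist)

for i, j, d in pairs:
    print(f"candidate alignment: node {i} ~ node {j} (distance {d:.3f})")
```

In this sketch, pairs within a cluster are ranked by distance, so the closest pairs can be read as the strongest candidate relations. This mirrors the observation that smaller distances correspond to stronger relations such as equivalence, but the toy setup does not reproduce the reported experiments.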