Paper Title
GPT-GNN: Generative Pre-Training of Graph Neural Networks
Paper Authors
Paper Abstract
Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data. However, training GNNs usually requires abundant task-specific labeled data, which is often arduously expensive to obtain. One effective way to reduce the labeling effort is to pre-train an expressive GNN model on unlabeled data with self-supervision and then transfer the learned model to downstream tasks with only a few labels. In this paper, we present the GPT-GNN framework to initialize GNNs by generative pre-training. GPT-GNN introduces a self-supervised attributed graph generation task to pre-train a GNN so that it can capture the structural and semantic properties of the graph. We factorize the likelihood of the graph generation into two components: 1) Attribute Generation and 2) Edge Generation. By modeling both components, GPT-GNN captures the inherent dependency between node attributes and graph structure during the generative process. Comprehensive experiments on the billion-scale Open Academic Graph and Amazon recommendation data demonstrate that GPT-GNN significantly outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.
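The factorization described in the abstract can be read as an autoregressive decomposition of the attributed graph likelihood over a node ordering. The following is a minimal sketch of that reading; the notation (a node permutation \pi, attributes X_i and edges E_i of the i-th node) is assumed here for illustration and is not necessarily the paper's exact formulation:

\log p_\theta(X, E) \;=\; \mathbb{E}_{\pi}\Big[\, \sum_{i} \underbrace{\log p_\theta\big(X_i \mid X_{<i}, E_{<i}\big)}_{\text{Attribute Generation}} \;+\; \underbrace{\log p_\theta\big(E_i \mid X_{\le i}, E_{<i}\big)}_{\text{Edge Generation}} \,\Big]

Under this sketch, the first term predicts a node's attributes from the partially generated graph, and the second predicts the node's edges given those attributes and the graph so far, which is how the generative process couples node attributes with graph structure.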