Paper title
Generalized vec trick for fast learning of pairwise kernel models
Paper authors
Abstract
Pairwise learning corresponds to the supervised learning setting where the goal is to make predictions for pairs of objects. Prominent applications include predicting drug-target or protein-protein interactions, or customer-product preferences. In this work, we present a comprehensive review of pairwise kernels that have been proposed for incorporating prior knowledge about the relationship between the objects. Specifically, we consider the standard, symmetric and anti-symmetric Kronecker product kernels, as well as the metric-learning, Cartesian, ranking, linear, polynomial and Gaussian kernels. Recently, an O(nm + nq) time generalized vec trick algorithm, where n, m, and q denote the number of pairs, drugs and targets, respectively, was introduced for training kernel methods with the Kronecker product kernel. This was a significant improvement over previous O(n^2) training methods, since in most real-world applications m, q << n. In this work, we show how all the reviewed kernels can be expressed as sums of Kronecker products, allowing the use of the generalized vec trick for speeding up their computation. In the experiments, we demonstrate how the introduced approach allows scaling pairwise kernels to much larger data sets than previously feasible, and provide an extensive comparison of the kernels on a number of biological interaction prediction tasks.
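To make the complexity claim concrete, the following is a minimal NumPy sketch of the idea behind the generalized vec trick for the standard Kronecker product kernel. It is an illustration under stated assumptions, not the authors' implementation: the function names (`naive_product`, `gvt_product`) and argument names are hypothetical, `Kd` and `Kt` are assumed to be precomputed drug and target kernel matrices, and each pair is assumed to be given as a (drug index, target index) tuple. The sketch computes p_i = sum_j Kd[d_i, d_j] * Kt[t_i, t_j] * a_j without ever forming the n x n pairwise kernel matrix.

```python
import numpy as np

def naive_product(Kd, Kt, rows_d, rows_t, cols_d, cols_t, a):
    """Reference version: builds the full pairwise kernel matrix, O(n^2)."""
    # K[i, j] = Kd[rows_d[i], cols_d[j]] * Kt[rows_t[i], cols_t[j]]
    K = Kd[np.ix_(rows_d, cols_d)] * Kt[np.ix_(rows_t, cols_t)]
    return K @ a

def gvt_product(Kd, Kt, rows_d, rows_t, cols_d, cols_t, a):
    """Same product in O(nq + nm) time via an m x q intermediate matrix."""
    m, q = Kd.shape[1], Kt.shape[0]
    B = np.zeros((m, q))
    # Step 1, O(nq): B[d, t] = sum over pairs j with cols_d[j] == d
    #                          of a[j] * Kt[t, cols_t[j]]
    # np.add.at accumulates correctly when a drug index repeats.
    np.add.at(B, cols_d, a[:, None] * Kt[:, cols_t].T)
    # Step 2, O(nm): p[i] = sum_d Kd[rows_d[i], d] * B[d, rows_t[i]]
    return np.einsum("id,di->i", Kd[rows_d, :], B[:, rows_t])

# Small synthetic example (random data, for illustration only).
rng = np.random.default_rng(0)
m, q, n = 5, 4, 12                      # drugs, targets, labeled pairs
Xd, Xt = rng.normal(size=(m, 3)), rng.normal(size=(q, 3))
Kd, Kt = Xd @ Xd.T, Xt @ Xt.T           # linear kernels as stand-ins
d_idx = rng.integers(0, m, size=n)      # drug index of each pair
t_idx = rng.integers(0, q, size=n)      # target index of each pair
a = rng.normal(size=n)                  # e.g. dual coefficients

p_fast = gvt_product(Kd, Kt, d_idx, t_idx, d_idx, t_idx, a)
p_slow = naive_product(Kd, Kt, d_idx, t_idx, d_idx, t_idx, a)
print(np.allclose(p_fast, p_slow))
```

Because a kernel expressed as a sum of Kronecker products reduces to a sum of such matrix-vector products, a routine like `gvt_product` could in principle be called once per term and the results added; the exact decompositions for each kernel are given in the paper and are not reproduced here.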