Paper Title

Overlapping Spaces for Compact Graph Representations

Authors

Kirill Shevkunov, Liudmila Prokhorenkova

Abstract

Various non-trivial spaces are becoming popular for embedding structured data such as graphs, texts, or images. Following spherical and hyperbolic spaces, more general product spaces have been proposed. However, searching for the best configuration of a product space is a resource-intensive procedure, which reduces the practical applicability of the idea. We generalize the concept of a product space and introduce an overlapping space that does not have the configuration search problem. The main idea is to allow subsets of coordinates to be shared between spaces of different types (Euclidean, hyperbolic, spherical). As a result, parameter optimization automatically learns the optimal configuration. Additionally, overlapping spaces allow for more compact representations since their geometry is more complex. Our experiments confirm that overlapping spaces outperform the competitors in graph embedding tasks. Here, we consider both the distortion setup, where the aim is to preserve distances, and the ranking setup, where the relative order should be preserved. The proposed method effectively solves the problem and outperforms the competitors in both settings. We also perform an empirical analysis in a realistic information retrieval task, where we compare all spaces by incorporating them into DSSM. In this case, the proposed overlapping space consistently achieves nearly optimal results without any configuration tuning. This reduces training time, which can be significant in large-scale applications.
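
To make the main idea concrete, the following is a minimal sketch of a distance that lets Euclidean, hyperbolic, and spherical components read overlapping slices of the same coordinate vector. It is an illustration under stated assumptions, not the construction from the paper: the abstract does not specify how the component distances are aggregated, so the weighted sum, the hyperboloid lift, and the names COMPONENTS, WEIGHTS, and overlapping_distance are all hypothetical choices made here. In a trained model the weights would be learned together with the embeddings, which is what would replace an explicit configuration search.

```python
import numpy as np


def euclidean(x, y):
    """Ordinary Euclidean distance on the selected coordinates."""
    return float(np.linalg.norm(x - y))


def spherical(x, y, eps=1e-12):
    """Angle between the two points after projecting them onto the unit sphere."""
    xn = x / max(np.linalg.norm(x), eps)
    yn = y / max(np.linalg.norm(y), eps)
    return float(np.arccos(np.clip(np.dot(xn, yn), -1.0, 1.0)))


def hyperbolic(x, y):
    """Hyperboloid-model distance (assumed lift: x0 = sqrt(1 + |x|^2))."""
    x0 = np.sqrt(1.0 + np.dot(x, x))
    y0 = np.sqrt(1.0 + np.dot(y, y))
    # Negated Lorentz inner product; it is >= 1 up to rounding error.
    inner = x0 * y0 - np.dot(x, y)
    return float(np.arccosh(max(inner, 1.0)))


# Hypothetical layout for a 4-dimensional embedding: each entry is
# (distance function, start, stop). Coordinates 0..3 are shared by several
# components of different types, unlike in a product space, where every
# coordinate belongs to exactly one factor.
COMPONENTS = [
    (euclidean, 0, 4),
    (hyperbolic, 0, 4),
    (spherical, 0, 4),
    (hyperbolic, 0, 2),
    (spherical, 2, 4),
]

# Non-negative weights; fixed here, learnable in an actual model (assumption).
WEIGHTS = [1.0, 0.5, 0.5, 0.25, 0.25]


def overlapping_distance(x, y, components=COMPONENTS, weights=WEIGHTS):
    """Weighted sum of component distances over overlapping coordinate slices."""
    total = 0.0
    for (dist_fn, start, stop), w in zip(components, weights):
        total += w * dist_fn(x[start:stop], y[start:stop])
    return total


if __name__ == "__main__":
    a = np.array([0.3, -0.1, 0.5, 0.2])
    b = np.array([-0.4, 0.6, 0.1, 0.9])
    print(overlapping_distance(a, b))
```

If a weight is driven to zero during optimization, the corresponding component drops out, so under these assumptions a single parameterization covers many product-space configurations at once.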
