Paper title
Structured Query-Based Image Retrieval Using Scene Graphs
Authors
Abstract
Unlike a single object (e.g. 'woman' or 'motorcycle'), a structured query can capture the complexity of object interactions (e.g. 'woman rides motorcycle'). Retrieval using structured queries is therefore much more useful than single-object retrieval, but it is also a much more challenging problem. In this paper we present an approach to image retrieval that is based on scene graph embeddings. We examine how visual relationships, derived from scene graphs, can be used as structured queries. A visual relationship is a directed subgraph of the scene graph in which a subject node and an object node are connected by a predicate. Notably, we are able to achieve high recall even on the low- to medium-frequency objects found in the long-tailed COCO-Stuff dataset, and we find that adding a visual relationship-inspired loss boosts our recall by 10% in the best case.
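To make the idea of a visual relationship as a structured query concrete, the following is a minimal toy sketch, not the method of the paper. It assumes each image's scene graph is stored as a set of (subject, predicate, object) triples and ranks images by simple component overlap, whereas the paper retrieves with learned scene graph embeddings. The names `Relationship`, `SCENE_GRAPHS`, `score`, `retrieve`, and `recall_at_k`, and the three example images, are all hypothetical.

```python
from typing import NamedTuple


class Relationship(NamedTuple):
    """A directed subgraph: subject --predicate--> object."""
    subject: str
    predicate: str
    object: str


# Hypothetical toy database: image id -> relationships in its scene graph.
SCENE_GRAPHS = {
    "img_1": {Relationship("woman", "rides", "motorcycle")},
    "img_2": {Relationship("woman", "stands beside", "motorcycle")},
    "img_3": {Relationship("man", "rides", "horse")},
}


def score(query, graph):
    """Best component overlap (0 to 3) between the query and any triple in the graph.

    Assumption: this stands in for a similarity between learned embeddings.
    """
    return max((sum(q == r for q, r in zip(query, rel)) for rel in graph), default=0)


def retrieve(query, k):
    """Return the k image ids with the highest overlap score."""
    ranked = sorted(
        SCENE_GRAPHS,
        key=lambda image_id: score(query, SCENE_GRAPHS[image_id]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(query, relevant, k):
    """Fraction of the relevant images that appear in the top-k results."""
    hits = relevant & set(retrieve(query, k))
    return len(hits) / len(relevant)


query = Relationship("woman", "rides", "motorcycle")
top = retrieve(query, k=2)
print(top)
print(recall_at_k(query, {"img_1"}, k=1))
```

In this toy setting the structured query ranks the image that contains the full relationship above images that merely contain the same objects, which is the distinction a single-object query cannot express.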