CurlingNet: Compositional Learning between Images and Text for Fashion IQ Data

Paper title

CurlingNet: Compositional Learning between Images and Text for Fashion IQ Data

Authors

Youngjae Yu, Seunghwan Lee, Yuncheol Choi, Gunhee Kim

Abstract

We present CurlingNet, an approach that measures the semantic distance between compositions of image-text embeddings. To learn an effective image-text composition for data in the fashion domain, our model introduces two key components. First, Delivery transitions a source image in the embedding space. Second, Sweeping emphasizes the query-related components of fashion images in the embedding space. A channel-wise gating mechanism makes this possible. Our single model outperforms previous state-of-the-art image-text composition models, including TIRG and FiLM. We participated in the first Fashion-IQ challenge at ICCV 2019, where an ensemble of our models achieved one of the best performances.
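The abstract names channel-wise gating but does not spell it out. The following is a minimal sketch of the generic idea only, not the authors' Delivery or Sweeping modules: a gate in (0, 1) is computed per channel from the text feature and used to rescale the matching channel of the image feature. The function names, the single linear layer, the random weights, and the toy dimensions are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def channel_gate(image_feat, text_feat, weights, bias):
    """Scale each image channel by a text-conditioned gate.

    Hypothetical helper: the gate for channel c is
    sigmoid(weights[c] . text_feat + bias[c]).
    """
    gates = [sigmoid(z) for z in linear(weights, bias, text_feat)]
    gated = [g * x for g, x in zip(gates, image_feat)]
    return gated, gates


def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy setup (assumed sizes): 4 image channels, 3-dimensional text feature.
rng = random.Random(0)
num_channels, text_dim = 4, 3
weights = [[rng.uniform(-1, 1) for _ in range(text_dim)]
           for _ in range(num_channels)]
bias = [0.0] * num_channels

image_feat = [1.0, -2.0, 0.5, 3.0]   # stand-in for a source image embedding
text_feat = [0.2, -0.1, 0.7]         # stand-in for a query text embedding

gated, gates = channel_gate(image_feat, text_feat, weights, bias)

# With all-zero weights and bias every gate is exactly 0.5.
zero_weights = [[0.0] * text_dim for _ in range(num_channels)]
half_gated, half_gates = channel_gate(image_feat, text_feat, zero_weights, bias)
```

In a retrieval setting, a composed embedding like `gated` would be compared against candidate target embeddings with a distance such as `squared_distance`; how CurlingNet itself composes and scores embeddings is described in the paper, not here.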
