Paper title
CurlingNet: Compositional Learning between Images and Text for Fashion IQ Data
Paper authors
Paper abstract
We present CurlingNet, an approach that can measure the semantic distance of a composition of image-text embeddings. To learn an effective image-text composition for data in the fashion domain, our model introduces two key components. First, Delivery transitions the source image in the embedding space. Second, Sweeping emphasizes the query-related components of fashion images in the embedding space. A channel-wise gating mechanism makes this possible. Our single model outperforms previous state-of-the-art image-text composition models, including TIRG and FiLM. We participated in the first Fashion-IQ challenge at ICCV 2019, in which an ensemble of our models achieved one of the best performances.
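The abstract does not give the exact form of Delivery or Sweeping, so the following is only an illustrative sketch of how a text-conditioned transition and a channel-wise gate could be combined into a distance. The function names (`compose`, `gated_distance`), the weight matrices `W_delivery` and `W_gate`, the additive offset, and the choice of squared Euclidean distance are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, used here to build a gate in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def compose(img_emb, txt_emb, params):
    """Hypothetical composition of a source image embedding with a text query."""
    joint = np.concatenate([img_emb, txt_emb], axis=-1)
    # "Delivery" (assumed form): shift the source embedding by a
    # text-conditioned offset in the embedding space.
    delivered = img_emb + joint @ params["W_delivery"]
    # "Sweeping" (assumed form): one gate value per embedding channel,
    # computed from the query, to emphasize query-related channels.
    gate = sigmoid(joint @ params["W_gate"])
    return delivered, gate


def gated_distance(delivered, gate, target_emb):
    """Squared Euclidean distance after re-weighting the target channels."""
    diff = delivered - gate * target_emb
    return np.sum(diff ** 2, axis=-1)


# Toy data: a batch of 2 queries with 8-d image and 4-d text embeddings.
rng = np.random.default_rng(0)
d_img, d_txt = 8, 4
params = {
    "W_delivery": rng.normal(scale=0.1, size=(d_img + d_txt, d_img)),
    "W_gate": rng.normal(scale=0.1, size=(d_img + d_txt, d_img)),
}
img = rng.normal(size=(2, d_img))
txt = rng.normal(size=(2, d_txt))
targets = rng.normal(size=(2, d_img))

delivered, gate = compose(img, txt, params)
dist = gated_distance(delivered, gate, targets)
```

In a retrieval setting, a distance of this kind would be computed between each composed query and every candidate image, and candidates would be ranked by ascending distance. The weights here are random, so the numbers carry no meaning beyond showing the shapes involved.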