Paper Title


Few-shot Visual Reasoning with Meta-analogical Contrastive Learning

Authors

Youngsung Kim, Jinwoo Shin, Eunho Yang, Sung Ju Hwang

Abstract


While humans can solve a visual puzzle that requires logical reasoning by observing only a few samples, state-of-the-art deep reasoning models require training on a large amount of data to obtain similar performance on the same task. In this work, we propose to solve such a few-shot (or low-shot) visual reasoning problem by resorting to analogical reasoning, a unique human ability to identify structural or relational similarity between two sets. Specifically, given training and test sets that contain the same type of visual reasoning problems, we extract the structural relationships between elements in both domains and enforce them to be as similar as possible with analogical learning. We repeatedly apply this process with slightly modified queries of the same problem, under the assumption that the modifications do not affect the relationship between a training and a test sample. This allows the relational similarity between the two samples to be learned effectively even with a single pair of samples. We validate our method on the RAVEN dataset, on which it outperforms state-of-the-art methods, with larger gains when the training data is scarce. We further meta-learn our analogical contrastive learning model over the same tasks with diverse attributes, and show that it generalizes to the same visual reasoning problem with unseen attributes.
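The contrastive objective the abstract describes — pulling the relation embedding of a sample toward embeddings of slightly modified queries of the same problem, and away from embeddings of other problems — can be sketched as an InfoNCE-style loss. This is a minimal illustration, not the paper's exact formulation: the function name `info_nce`, the temperature `tau`, and the use of cosine similarity over precomputed relation embeddings are all assumptions for the sake of a runnable example.

```python
import numpy as np

def info_nce(anchor, positives, negatives, tau=0.1):
    """InfoNCE-style contrastive loss over relation embeddings.

    anchor    : relation embedding of one sample (1-D array).
    positives : embeddings of perturbed queries of the SAME problem.
    negatives : embeddings drawn from OTHER problems.
    tau       : temperature controlling the sharpness of the softmax.

    Returns a scalar loss; it is small when the anchor is close to the
    positives and far from the negatives (hypothetical sketch, not the
    authors' implementation).
    """
    def cos_sim(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

    pos = np.exp([cos_sim(anchor, p) / tau for p in positives])
    neg = np.exp([cos_sim(anchor, n) / tau for n in negatives])
    return -np.log(pos.sum() / (pos.sum() + neg.sum()))
```

With a single training/test pair, the repeated query perturbations supply the multiple positives this loss needs, which is what makes the contrastive formulation usable in the few-shot regime described above.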
