Paper Title
Learning from One and Only One Shot
Paper Authors
Paper Abstract
Humans can generalize from only a few examples and with little pretraining on similar tasks. Yet machine learning (ML) typically requires large amounts of data to learn, or to pre-learn in order to transfer. Motivated by nativism and artificial general intelligence, we directly model human-innate priors in abstract visual tasks such as character and doodle recognition. This yields a white-box model that learns general-appearance similarity by mimicking how humans naturally ``distort'' an object at first sight. Using just nearest-neighbor classification on this cognitively inspired similarity space, we achieve human-level recognition with only $1$--$10$ examples per class and no pretraining. This differs from few-shot learning, which relies on massive pretraining. In the tiny-data regime of the MNIST, EMNIST, Omniglot, and QuickDraw benchmarks, we outperform both modern neural networks and classical ML. For unsupervised learning, by learning the non-Euclidean, general-appearance similarity space in a $k$-means style, we achieve multifarious visual realizations of abstract concepts by generating human-intuitive archetypes as cluster centroids.
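The classification step described in the abstract, nearest-neighbor matching in a learned similarity space with only $1$--$10$ labeled examples per class and no pretraining, could look roughly like the minimal sketch below. This is not the authors' implementation: `appearance_similarity` and `nn_classify` are hypothetical names, and the paper's cognitively inspired, non-Euclidean similarity is stood in for by a plain negative Euclidean distance only to keep the example self-contained and runnable.

```python
# Minimal sketch (assumptions, not the authors' code): 1-nearest-neighbor
# classification under a pluggable similarity function.
import numpy as np


def appearance_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Hypothetical placeholder for the paper's general-appearance similarity.

    Stand-in: negative Euclidean distance between flattened images,
    so a higher score means the two images look more alike.
    """
    return -float(np.linalg.norm(a.ravel() - b.ravel()))


def nn_classify(query: np.ndarray,
                support_images: list[np.ndarray],
                support_labels: list[int]) -> int:
    """Return the label of the most similar support example.

    The support set holds the 1-10 labeled examples per class; no
    training or pretraining is involved beyond storing them.
    """
    scores = [appearance_similarity(query, img) for img in support_images]
    return support_labels[int(np.argmax(scores))]
```

Swapping `appearance_similarity` for a different similarity changes the geometry of the space without touching the classifier, which is the sense in which the abstract's nearest-neighbor rule is decoupled from the learned similarity.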