Paper Title
Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning
Paper Authors
Paola Cascante-Bonilla, Fuwen Tan, Yanjun Qi, Vicente Ordonez
Paper Abstract
In this paper we revisit the idea of pseudo-labeling in the context of semi-supervised learning, where a learning algorithm has access to a small set of labeled samples and a large set of unlabeled samples. Pseudo-labeling works by applying pseudo-labels to samples in the unlabeled set, using a model trained on the combination of the labeled samples and any previously pseudo-labeled samples, and iteratively repeating this process in a self-training cycle. Current methods seem to have abandoned this approach in favor of consistency regularization methods that train models under a combination of different styles of self-supervised losses on the unlabeled samples and standard supervised losses on the labeled samples. We empirically demonstrate that pseudo-labeling can in fact be competitive with the state of the art, while being more resilient to out-of-distribution samples in the unlabeled set. We identify two key factors that allow pseudo-labeling to achieve such remarkable results: (1) applying curriculum learning principles, and (2) avoiding concept drift by restarting model parameters before each self-training cycle. We obtain 94.91% accuracy on CIFAR-10 using only 4,000 labeled samples, and 68.87% top-1 accuracy on ImageNet-ILSVRC using only 10% of the labeled samples. The code is available at https://github.com/uvavision/Curriculum-Labeling
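The abstract above describes the complete procedure. Below is a minimal sketch of that self-training loop, under stated assumptions: scikit-learn's LogisticRegression stands in for the deep networks used in the paper, and the linear percentile schedule and helper structure are illustrative, not the authors' exact implementation (see the linked repository for that).

import numpy as np
from sklearn.linear_model import LogisticRegression

def curriculum_labeling(x_l, y_l, x_u, num_cycles=5):
    """Self-train on labeled data (x_l, y_l) plus pseudo-labeled unlabeled
    data x_u, admitting pseudo-labels by confidence percentile."""
    pseudo_x = np.empty((0, x_l.shape[1]))
    pseudo_y = np.empty((0,), dtype=int)
    for cycle in range(num_cycles):
        # Key factor (2): restart model parameters before each cycle,
        # so errors from earlier pseudo-labels do not accumulate.
        model = LogisticRegression(max_iter=1000)  # stand-in classifier
        model.fit(np.concatenate([x_l, pseudo_x]),
                  np.concatenate([y_l, pseudo_y]))
        # Key factor (1): curriculum. Admit only the most confident
        # pseudo-labels first, relaxing the percentile cutoff each cycle
        # (top 20%, then top 40%, ..., until all unlabeled data is used).
        probs = model.predict_proba(x_u)           # shape (N, num_classes)
        conf = probs.max(axis=1)
        keep_frac = (cycle + 1) / num_cycles
        cutoff = np.percentile(conf, 100 * (1 - keep_frac))
        mask = conf >= cutoff
        pseudo_x = x_u[mask]
        pseudo_y = probs[mask].argmax(axis=1)
    return model

Restarting from fresh parameters in every cycle, rather than fine-tuning the previous model, is the design choice the abstract credits with avoiding concept drift.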