Paper Title

Background Splitting: Finding Rare Classes in a Sea of Background

Paper Authors

Ravi Teja Mullapudi, Fait Poms, William R. Mark, Deva Ramanan, Kayvon Fatahalian

Paper Abstract

We focus on the real-world problem of training accurate deep models for image classification of a small number of rare categories. In these scenarios, almost all images belong to the background category in the dataset (>95% of the dataset is background). We demonstrate that both standard fine-tuning approaches and state-of-the-art approaches for training on imbalanced datasets do not produce accurate deep models in the presence of this extreme imbalance. Our key observation is that the extreme imbalance due to the background category can be drastically reduced by leveraging visual knowledge from an existing pre-trained model. Specifically, the background category is "split" into smaller and more coherent pseudo-categories during training using a pre-trained model. We incorporate background splitting into an image classification model by adding an auxiliary loss that learns to mimic the predictions of the existing, pre-trained image classification model. Note that this process is automatic and requires no additional manual labels. The auxiliary loss regularizes the feature representation of the shared network trunk by requiring it to discriminate between previously homogeneous background instances and reduces overfitting to the small number of rare category positives. We also show that BG splitting can be combined with other background imbalance methods to further improve performance. We evaluate our method on a modified version of the iNaturalist dataset where only a small subset of rare category labels are available during training (all other images are labeled as background). By jointly learning to recognize ImageNet categories and selected iNaturalist categories, our approach yields performance that is 42.3 mAP points higher than a fine-tuning baseline when 99.98% of the data is background, and 8.3 mAP points higher than SotA baselines when 98.30% of the data is background.
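To make the mechanism concrete, here is a minimal PyTorch-style sketch of the training setup the abstract describes: a shared trunk with a main head for the rare categories (plus background) and an auxiliary head trained to mimic a frozen pre-trained classifier, whose predictions "split" the background into pseudo-categories. This is an illustration under assumptions, not the authors' implementation: the ResNet-50 backbone, the names `rare_head`, `aux_head`, and `lambda_aux`, and the use of hard argmax pseudo-labels (rather than, say, soft-target distillation) are all hypothetical choices; the paper only states that an auxiliary loss mimics the pre-trained model's predictions, with no extra manual labels. The torchvision weights API assumed here requires torchvision >= 0.13.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class BackgroundSplitModel(nn.Module):
    """Shared trunk with two heads: a main head for the rare categories
    (plus one background class) and an auxiliary head over pseudo-categories."""

    def __init__(self, num_rare: int, num_pseudo: int = 1000):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()  # strip the classifier; keep the shared trunk
        self.trunk = backbone
        self.rare_head = nn.Linear(feat_dim, num_rare + 1)  # +1: background class
        self.aux_head = nn.Linear(feat_dim, num_pseudo)     # pseudo-categories

    def forward(self, x):
        feats = self.trunk(x)
        return self.rare_head(feats), self.aux_head(feats)


# Frozen pre-trained classifier that supplies a pseudo-label for every image,
# including background images, so no additional manual labels are required.
teacher = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).eval()
for p in teacher.parameters():
    p.requires_grad_(False)


def background_splitting_loss(model, images, labels, lambda_aux=1.0):
    """Main rare-category loss plus an auxiliary loss that 'splits' the
    background by mimicking the frozen teacher's predicted classes
    (hard argmax targets are an assumption for this sketch)."""
    rare_logits, aux_logits = model(images)
    with torch.no_grad():
        pseudo = teacher(images).argmax(dim=1)  # teacher-assigned pseudo-class
    main_loss = F.cross_entropy(rare_logits, labels)
    aux_loss = F.cross_entropy(aux_logits, pseudo)
    return main_loss + lambda_aux * aux_loss
```

In this sketch, a background image contributes only the generic background label to the main head, but the auxiliary head still forces the shared trunk to distinguish it from other background images via its teacher-assigned pseudo-class; this is what regularizes the representation and reduces overfitting to the few rare-category positives.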
