Paper Title


Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect

Paper Authors

Kaihua Tang, Jianqiang Huang, Hanwang Zhang

Paper Abstract


As the class size grows, maintaining a balanced dataset across many classes is challenging because the data are long-tailed in nature; it is even impossible when the samples of interest co-exist with each other in one collectable unit, e.g., multiple visual instances in one image. Therefore, long-tailed classification is key to deep learning at scale. However, existing methods are mainly based on re-weighting/re-sampling heuristics that lack a fundamental theory. In this paper, we establish a causal inference framework, which not only unravels the whys of previous methods, but also derives a new principled solution. Specifically, our theory shows that the SGD momentum is essentially a confounder in long-tailed classification. On one hand, it has a harmful causal effect that biases the tail prediction towards the head. On the other hand, its induced mediation also benefits representation learning and head prediction. Our framework elegantly disentangles the paradoxical effects of the momentum by pursuing the direct causal effect caused by an input sample. In particular, we use causal intervention in training, and counterfactual reasoning in inference, to remove the "bad" while keeping the "good". We achieve new state-of-the-art results on three long-tailed visual recognition benchmarks: Long-tailed CIFAR-10/-100 and ImageNet-LT for image classification, and LVIS for instance segmentation.
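The counterfactual reasoning at inference can be illustrated with a minimal sketch: during training one tracks an exponential moving average of the features (a proxy for the momentum-induced direction `d`), and at test time subtracts the indirect effect contributed by the component of the feature along that direction from the logits. The function name `tde_logits`, the scaling hyperparameter `alpha`, and the plain linear classifier `W` below are illustrative assumptions, not the paper's exact multi-head normalized formulation.

```python
import numpy as np

def tde_logits(x, W, d, alpha=1.0):
    """Hedged sketch of counterfactual (TDE-style) inference.

    x : (D,) feature vector of a test sample.
    W : (C, D) linear classifier weights (assumption: plain linear head).
    d : (D,) moving-average feature direction accumulated during training,
        standing in for the momentum confounder.
    alpha : trade-off hyperparameter for how much indirect effect to remove.
    """
    d_hat = d / (np.linalg.norm(d) + 1e-12)
    # Cosine similarity between the sample and the momentum direction.
    cos = (x @ d_hat) / (np.linalg.norm(x) + 1e-12)
    # Component of x lying along the momentum direction.
    x_proj = (x @ d_hat) * d_hat
    # Subtract the logits contributed by that component, scaled by cos.
    return W @ x - alpha * cos * (W @ x_proj)
```

A sample orthogonal to `d` is left untouched (its indirect effect is zero), while a sample aligned with the head-biased direction has that contribution removed before prediction.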
