Rebalanced Siamese Contrastive Mining for Long-Tailed Recognition

Paper Title

Rebalanced Siamese Contrastive Mining for Long-Tailed Recognition

Authors

Zhisheng Zhong, Jiequan Cui, Zeming Li, Eric Lo, Jian Sun, Jiaya Jia

Abstract

Deep neural networks perform poorly on heavily class-imbalanced datasets. Given the promising performance of contrastive learning, we propose Rebalanced Siamese Contrastive Mining (ResCom) to tackle imbalanced recognition. Based on mathematical analysis and simulation results, we claim that supervised contrastive learning suffers from a dual class-imbalance problem at both the original batch and Siamese batch levels, which is more serious than in long-tailed classification learning. At the original batch level, we introduce a class-balanced supervised contrastive loss that assigns adaptive weights to different classes. At the Siamese batch level, we present a class-balanced queue, which maintains the same number of keys for all classes. Furthermore, we note that the gradient of the imbalanced contrastive loss with respect to the contrastive logits can be decoupled into positive and negative parts, and that easy positives and easy negatives make the contrastive gradient vanish. We propose supervised hard positive and negative pair mining to select informative pairs for contrastive computation and improve representation learning. Finally, to approximately maximize the mutual information between the two views, we propose Siamese Balanced Softmax and combine it with the contrastive loss for one-stage training. Extensive experiments demonstrate that ResCom outperforms previous methods by large margins on multiple long-tailed recognition benchmarks. Our code and models are publicly available at: https://github.com/dvlab-research/ResCom.
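
The class-balanced queue mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ClassBalancedQueue`, its methods, and the `keys_per_class` parameter are hypothetical, and the sketch only assumes that each class gets its own fixed-capacity first-in-first-out buffer, so head classes cannot crowd out tail classes.

```python
from collections import deque


class ClassBalancedQueue:
    """Hypothetical sketch: one FIFO buffer per class, all with equal capacity."""

    def __init__(self, num_classes, keys_per_class):
        # deque(maxlen=...) silently drops the oldest item once it is full.
        self.queues = [deque(maxlen=keys_per_class) for _ in range(num_classes)]

    def enqueue(self, keys, labels):
        # Route each key to the buffer of its own class.
        for key, label in zip(keys, labels):
            self.queues[label].append(key)

    def sizes(self):
        # Number of keys currently stored for each class.
        return [len(q) for q in self.queues]

    def all_keys(self):
        # Flatten to (label, key) pairs, e.g. for use as contrastive keys.
        return [(label, key) for label, q in enumerate(self.queues) for key in q]


# Long-tailed stream: class 0 appears 100 times, class 2 only 3 times.
queue = ClassBalancedQueue(num_classes=3, keys_per_class=4)
labels = [0] * 100 + [1] * 10 + [2] * 3
keys = [[float(i)] for i in range(len(labels))]
queue.enqueue(keys, labels)
print(queue.sizes())  # [4, 4, 3]
```

In this sketch a tail class holds fewer keys than the others until it has seen enough samples to fill its buffer. How ResCom initializes and updates its queue is not stated in the abstract.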
