Paper Title

Equalization Loss for Long-Tailed Object Recognition

Authors

Jingru Tan, Changbao Wang, Buyu Li, Quanquan Li, Wanli Ouyang, Changqing Yin, Junjie Yan

Abstract

Object recognition techniques using convolutional neural networks (CNNs) have achieved great success. However, state-of-the-art object detection methods still perform poorly on large-vocabulary, long-tailed datasets, e.g. LVIS. In this work, we analyze this problem from a novel perspective: each positive sample of one category can be seen as a negative sample for other categories, making the tail categories receive more discouraging gradients. Based on this analysis, we propose a simple but effective loss, named equalization loss, to tackle the problem of long-tailed rare categories by simply ignoring those gradients for rare categories. The equalization loss protects the learning of rare categories from being at a disadvantage during network parameter updating. Thus the model is capable of learning better discriminative features for objects of rare classes. Without any bells and whistles, our method achieves AP gains of 4.1% and 4.8% for the rare and common categories on the challenging LVIS benchmark, compared to the Mask R-CNN baseline. With the effective equalization loss, we won 1st place in the LVIS Challenge 2019. Code has been made available at: https://github.com/tztztztztz/eql.detectron2
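The core idea described in the abstract can be illustrated with a short sketch: per-class sigmoid cross-entropy is reweighted so that, for a foreground proposal, the negative-sample gradient pushed onto rare categories is zeroed out. The sketch below is a minimal NumPy illustration, not the authors' released implementation; the function name `equalization_loss`, the single-proposal interface, and the threshold value `lam` are assumptions for illustration (in practice the frequency threshold is a tuned hyperparameter).

```python
import numpy as np

def equalization_loss(logits, targets, class_freq, lam=1.76e-3, is_foreground=True):
    """Illustrative sketch of the equalization loss for one region proposal.

    logits:     (C,) raw classification scores
    targets:    (C,) one-hot ground-truth vector (all zeros for background)
    class_freq: (C,) relative frequency of each category in the dataset
    lam:        frequency threshold; categories with freq < lam count as rare
                (hypothetical default; a tuned hyperparameter in practice)
    """
    probs = 1.0 / (1.0 + np.exp(-logits))       # per-class sigmoid
    rare = (class_freq < lam).astype(float)     # indicator: rare category
    e_r = 1.0 if is_foreground else 0.0         # indicator: foreground proposal
    # w_j = 1 - E(r) * T(f_j) * (1 - y_j): for foreground proposals, ignore
    # the negative-sample (discouraging) gradient flowing into rare classes
    w = 1.0 - e_r * rare * (1.0 - targets)
    eps = 1e-12
    bce = -(targets * np.log(probs + eps) + (1.0 - targets) * np.log(1.0 - probs + eps))
    return float(np.sum(w * bce))
```

For example, when a frequent-category positive is scored alongside a rare category, the rare class's negative term is dropped, so the loss is strictly smaller than the unweighted cross-entropy; for background proposals all terms are kept unchanged.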
