Paper Title
Towards Better Object Detection in Scale Variation with Adaptive Feature Selection
Authors
Abstract
Exploiting pyramidal feature representations is a common way to tackle scale variation among object instances. However, most existing detectors still predict objects within a given range of scales based solely or mainly on a single-level representation, which yields inferior detection performance. To this end, we propose a novel adaptive feature selection module (AFSM) that automatically learns, in a data-driven manner, how to fuse multi-level representations along the channel dimension. It significantly improves the performance of detectors that have a feature pyramid structure while adding almost no inference overhead. Moreover, we propose a class-aware sampling mechanism (CASM) to tackle the class imbalance problem by re-weighting the sampling ratio of each training image based on the statistical characteristics of each class, which is crucial for improving performance on minority classes. Experimental results demonstrate the effectiveness of the proposed method: it achieves 83.04% mAP at 15.96 FPS on the VOC dataset and 39.48% AP on the VisDrone-DET validation subset, outperforming other state-of-the-art detectors considerably. The code is available at https://github.com/ZeHuiGong/AFSM.git.
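The abstract does not give the formulas behind AFSM or CASM, so the following is only an illustrative sketch of the two ideas, not the authors' implementation. It assumes that channel-wise fusion can be modeled as a softmax over pyramid levels computed separately for each channel, and that class-aware sampling can be modeled as weighting each image by the mean inverse instance frequency of the classes it contains. The function names `fuse_levels` and `image_sampling_weights`, the `logits` input, and both weighting rules are hypothetical.

```python
import math
from collections import Counter


def fuse_levels(features, logits):
    """Fuse multi-level features with per-channel softmax weights (assumed rule).

    features[l][c]: response of channel c at pyramid level l, assumed to be
    already resized to a common resolution and reduced to one value here.
    logits[l][c]: a learnable selection score (hypothetical stand-in).
    """
    num_levels, num_channels = len(features), len(features[0])
    fused = []
    for c in range(num_channels):
        # Softmax across levels, computed independently for each channel.
        peak = max(logits[l][c] for l in range(num_levels))
        exps = [math.exp(logits[l][c] - peak) for l in range(num_levels)]
        total = sum(exps)
        fused.append(
            sum(e / total * features[l][c] for l, e in enumerate(exps))
        )
    return fused


def image_sampling_weights(image_labels):
    """Return one sampling probability per image (assumed rule).

    image_labels[i] lists the class label of every instance in image i.
    Images that contain rare classes receive a larger probability.
    """
    # Count instances per class over the whole training set.
    class_counts = Counter(
        label for labels in image_labels for label in labels
    )
    raw = []
    for labels in image_labels:
        if not labels:
            raw.append(0.0)  # images without annotations are never sampled
            continue
        inverse = [1.0 / class_counts[label] for label in labels]
        raw.append(sum(inverse) / len(inverse))
    total = sum(raw)
    # Normalize so the weights form a probability distribution.
    return [w / total for w in raw]
```

Under these assumptions, the resulting weights could be passed to a weighted sampler such as `random.choices(images, weights=weights, k=batch_size)` so that images containing minority classes are drawn more often during training.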