Paper Title

SIOD: Single Instance Annotated Per Category Per Image for Object Detection

Authors

Hanjun Li, Xingjia Pan, Ke Yan, Fan Tang, Wei-Shi Zheng

Abstract

Object detection under imperfect data has received great attention recently. Weakly supervised object detection (WSOD) suffers from severe localization issues due to the lack of instance-level annotation, while semi-supervised object detection (SSOD) remains challenging due to the inter-image discrepancy between labeled and unlabeled data. In this study, we propose Single Instance annotated Object Detection (SIOD), which requires only one instance annotation for each category present in an image. By reducing the inter-task (WSOD) or inter-image (SSOD) discrepancy to an intra-image discrepancy, SIOD provides more reliable and richer prior knowledge for mining the remaining unlabeled instances and trades off annotation cost against performance. Under the SIOD setting, we propose a simple yet effective framework, termed Dual-Mining (DMiner), which consists of a Similarity-based Pseudo Label Generating module (SPLG) and a Pixel-level Group Contrastive Learning module (PGCL). SPLG first mines latent instances in the feature representation space to alleviate the missing-annotation problem. To avoid being misled by inaccurate pseudo labels, we propose PGCL to improve tolerance to false pseudo labels. Extensive experiments on MS COCO verify the feasibility of the SIOD setting and the superiority of the proposed method, which obtains consistent and significant improvements over baseline methods and achieves results comparable to fully supervised object detection (FSOD) methods with only 40% of instances annotated.
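The core SPLG idea described in the abstract, mining latent instances by their feature-space similarity to the single annotated instance per category, can be sketched roughly as below. This is a minimal illustration only: it assumes cosine similarity between pooled region features and the annotated-instance features with a fixed threshold, and all names (`splg_pseudo_labels`, `proposal_feats`, `anchor_feats`, `sim_thresh`) are hypothetical, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def splg_pseudo_labels(proposal_feats, anchor_feats, anchor_labels, sim_thresh=0.8):
    """Assign pseudo category labels to unlabeled proposals by cosine
    similarity against features of the annotated (anchor) instances.

    proposal_feats: (N, D) features of unlabeled candidate regions
    anchor_feats:   (K, D) features of the single annotated instance per category
    anchor_labels:  (K,)   category id of each annotated instance
    Returns a (N,) tensor of pseudo labels, with -1 where no category matches.
    """
    # L2-normalize so the dot product equals cosine similarity.
    p = F.normalize(proposal_feats, dim=1)
    a = F.normalize(anchor_feats, dim=1)
    sim = p @ a.t()                       # (N, K) similarity matrix

    best_sim, best_idx = sim.max(dim=1)   # most similar annotated instance
    pseudo = anchor_labels[best_idx].clone()
    pseudo[best_sim < sim_thresh] = -1    # reject low-confidence matches
    return pseudo

# Toy usage: 5 unlabeled proposals, 2 annotated instances (categories 3 and 7).
props = torch.randn(5, 256)
anchors = torch.randn(2, 256)
labels = torch.tensor([3, 7])
print(splg_pseudo_labels(props, anchors, labels, sim_thresh=0.5))
```

In the paper's framework, such mined pseudo labels are noisy, which is why PGCL is introduced on top to make training tolerant of false positives.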
