Paper Title
UpCycling: Semi-supervised 3D Object Detection without Sharing Raw-level Unlabeled Scenes
Authors
Abstract
Semi-supervised Learning (SSL) has received increasing attention in autonomous driving as a way to reduce the enormous burden of 3D annotation. In this paper, we propose UpCycling, a novel SSL framework for 3D object detection that requires zero additional raw-level point clouds: it learns from unlabeled, de-identified intermediate features (i.e., smashed data) to preserve privacy. Since these intermediate features are naturally produced by the inference pipeline, no additional computation is required on autonomous vehicles. However, generating an effective consistency loss for unlabeled feature-level scenes turns out to be a critical challenge. The latest SSL frameworks for 3D object detection, which enforce consistency regularization between different augmentations of an unlabeled raw-point scene, become detrimental when applied to intermediate features. To solve this problem, we introduce a novel combination of hybrid pseudo labels and feature-level Ground Truth sampling (F-GT), which safely augments unlabeled multi-type 3D scene features and provides high-quality supervision. We implement UpCycling on two representative 3D object detection models: SECOND-IoU and PV-RCNN. Experiments on widely used datasets (Waymo, KITTI, and Lyft) verify that UpCycling outperforms other augmentation methods applied at the feature level. In addition, while preserving privacy, UpCycling performs better than or comparably to state-of-the-art methods that utilize raw-level unlabeled data in both domain adaptation and partial-label scenarios.