Paper Title
PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection
Paper Authors
Paper Abstract
The remarkable breakthroughs in point cloud representation learning have boosted their usage in real-world applications such as self-driving cars and virtual reality. However, these applications usually require 3D object detection that is not only accurate but also efficient. Recently, knowledge distillation has been proposed as an effective model compression technique, which transfers the knowledge from an over-parameterized teacher to a lightweight student and has shown consistent effectiveness in 2D vision. However, due to the sparsity and irregularity of point clouds, directly applying previous image-based knowledge distillation methods to point cloud detectors usually leads to unsatisfactory performance. To fill the gap, this paper proposes PointDistiller, a structured knowledge distillation framework for point cloud-based 3D detection. Concretely, PointDistiller includes local distillation, which extracts and distills the local geometric structure of point clouds with dynamic graph convolution, and a reweighted learning strategy, which highlights student learning on the crucial points or voxels to improve knowledge distillation efficiency. Extensive experiments on both voxel-based and raw point-based detectors demonstrate the effectiveness of our method over seven previous knowledge distillation methods. For instance, our 4× compressed PointPillars student achieves 2.8 and 3.4 mAP improvements on BEV and 3D object detection, outperforming its teacher by 0.9 and 1.8 mAP, respectively. Code has been released at https://github.com/RunpeiDong/PointDistiller.
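To make the two ideas named in the abstract more concrete, the sketch below illustrates a simplified distillation loss: teacher and student point/voxel features are compared over kNN neighborhoods (a stand-in for the dynamic graph convolution used in the paper), and the per-point errors are reweighted by an importance score so that crucial points dominate the objective. This is a minimal illustrative sketch, not the authors' implementation; all tensor shapes, the linear projection, and the teacher-activation importance heuristic are assumptions.

```python
# Minimal sketch (assumed shapes and heuristics, not the official PointDistiller code):
# (1) local distillation over kNN neighborhoods of point/voxel features,
# (2) reweighted learning that emphasizes the most important points.
import torch
import torch.nn.functional as F


def knn_indices(xyz: torch.Tensor, k: int) -> torch.Tensor:
    """Indices of the k nearest neighbors for each point. xyz: (N, 3)."""
    dist = torch.cdist(xyz, xyz)                 # (N, N) pairwise distances
    return dist.topk(k, largest=False).indices   # (N, k)


def local_distillation_loss(
    student_feat: torch.Tensor,   # (N, Cs) student point/voxel features
    teacher_feat: torch.Tensor,   # (N, Ct) teacher point/voxel features
    xyz: torch.Tensor,            # (N, 3) point/voxel centers
    proj: torch.nn.Module,        # maps student features to the teacher's dimension
    importance: torch.Tensor,     # (N,) per-point weights, e.g. from teacher activations
    k: int = 8,
) -> torch.Tensor:
    idx = knn_indices(xyz, k)                     # (N, k)
    s_local = proj(student_feat)[idx]             # (N, k, Ct) neighborhood features
    t_local = teacher_feat[idx]                   # (N, k, Ct)
    per_point = F.mse_loss(s_local, t_local, reduction="none").mean(dim=(1, 2))  # (N,)
    w = importance / importance.sum().clamp(min=1e-8)  # normalized reweighting
    return (w * per_point).sum()


if __name__ == "__main__":
    N, Cs, Ct = 256, 64, 128
    xyz = torch.rand(N, 3)
    student = torch.rand(N, Cs)
    teacher = torch.rand(N, Ct)
    proj = torch.nn.Linear(Cs, Ct)
    # Hypothetical importance scores: here, the teacher's feature magnitude.
    importance = teacher.norm(dim=1)
    print(local_distillation_loss(student, teacher, xyz, proj, importance).item())
```

In practice this term would be added to the student detector's usual detection loss during training; the paper's actual formulation builds the local structure with dynamic graph convolution rather than the plain neighborhood feature matching shown here.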