Dynamic Edge Weights in Graph Neural Networks for 3D Object Detection

Paper Title

Dynamic Edge Weights in Graph Neural Networks for 3D Object Detection

Authors

Thakur, Sumesh, Peethambaran, Jiju

Abstract

A robust and accurate 3D detection system is an integral part of autonomous vehicles. Traditionally, most 3D object detection algorithms have focused on processing 3D point clouds using voxel grids or a bird's eye view (BEV). Recent works, however, demonstrate that the graph neural network (GNN) is a promising approach to 3D object detection. In this work, we propose an attention-based feature aggregation technique in a GNN for detecting objects in LiDAR scans. We first employ a distance-aware down-sampling scheme that not only enhances algorithmic performance but also retains the maximum geometric features of objects even if they lie far from the sensor. In each layer of the GNN, in addition to the linear transformation that maps per-node input features to the corresponding higher-level features, we perform per-node masked attention, which assigns different weights to the different nodes in each node's first-ring neighborhood. The masked attention implicitly accounts for the underlying neighborhood graph structure of every node and also eliminates the need for costly matrix operations, thereby improving detection accuracy without compromising performance. Experiments on the KITTI dataset show that our method yields comparable results for 3D object detection.
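
The per-node masked attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a GAT-style scoring function (LeakyReLU of a learned linear score followed by a softmax over the first-ring neighbors plus a self-loop), and the names `masked_attention_layer`, `weight`, `attn_vec`, and `slope` are hypothetical. Scores are computed only for nodes that are actually connected, so no dense N x N attention matrix is ever built.

```python
import numpy as np


def masked_attention_layer(features, edges, weight, attn_vec, slope=0.2):
    """One illustrative GNN layer: linear transform plus masked attention.

    features: (N, F_in) array of per-node input features.
    edges:    iterable of (src, dst) pairs; node dst aggregates from node src.
    weight:   (F_in, F_out) linear transformation (hypothetical parameter).
    attn_vec: (2 * F_out,) attention vector (hypothetical parameter).
    slope:    negative slope of the LeakyReLU (assumed, GAT-style).
    """
    n = features.shape[0]
    h = features @ weight                      # per-node higher-level features
    f_out = h.shape[1]
    a_dst, a_src = attn_vec[:f_out], attn_vec[f_out:]

    # First-ring neighborhood of every node, including the node itself.
    neighbours = [{i} for i in range(n)]
    for src, dst in edges:
        neighbours[dst].add(src)

    out = np.zeros_like(h, dtype=float)
    for i, nbr_set in enumerate(neighbours):
        nbrs = np.array(sorted(nbr_set))
        # Unnormalised score for each edge (j -> i), restricted to neighbors.
        scores = h[i] @ a_dst + h[nbrs] @ a_src
        scores = np.where(scores > 0, scores, slope * scores)  # LeakyReLU
        scores = scores - scores.max()                          # stable softmax
        alpha = np.exp(scores)
        alpha = alpha / alpha.sum()            # dynamic edge weights, sum to 1
        out[i] = alpha @ h[nbrs]               # weighted feature aggregation
    return out


# Tiny usage example: node 0 aggregates from nodes 1 and 2.
feats = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
result = masked_attention_layer(feats, [(1, 0), (2, 0)], np.eye(2), np.zeros(4))
```

With an all-zero `attn_vec`, every neighbor receives the same weight, so the layer reduces to a plain neighborhood mean; a trained attention vector would instead weight each edge differently.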
