Paper Title

MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps

Paper Authors

Pengxiang Wu, Siheng Chen, Dimitris Metaxas

Abstract

The ability to reliably perceive the environmental states, particularly the existence of objects and their motion behavior, is crucial for autonomous driving. In this work, we propose an efficient deep model, called MotionNet, to jointly perform perception and motion prediction from 3D point clouds. MotionNet takes a sequence of LiDAR sweeps as input and outputs a bird's eye view (BEV) map, which encodes the object category and motion information in each grid cell. The backbone of MotionNet is a novel spatio-temporal pyramid network, which extracts deep spatial and temporal features in a hierarchical fashion. To enforce the smoothness of predictions over both space and time, the training of MotionNet is further regularized with novel spatial and temporal consistency losses. Extensive experiments show that the proposed method overall outperforms the state-of-the-arts, including the latest scene-flow- and 3D-object-detection-based methods. This indicates the potential value of the proposed method serving as a backup to the bounding-box-based system, and providing complementary information to the motion planner in autonomous driving. Code is available at https://github.com/pxiangwu/MotionNet.
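To make the input representation concrete: the abstract describes feeding a sequence of LiDAR sweeps to the network as bird's eye view (BEV) grids, where each cell aggregates the points falling inside it. The sketch below shows one minimal way to voxelize a single sweep into a binary BEV occupancy map. It is illustrative only: the grid extent, cell size, and the use of a single sweep with binary occupancy are assumptions for this example, not the paper's exact preprocessing (MotionNet additionally stacks multiple synchronized sweeps and discretizes along height).

```python
import numpy as np

def points_to_bev(points, x_range=(-32.0, 32.0), y_range=(-32.0, 32.0),
                  cell_size=0.25):
    """Voxelize a LiDAR sweep of shape (N, 3) into a binary BEV occupancy grid.

    Illustrative sketch only; ranges and cell size are assumed values,
    not the settings used in the MotionNet paper.
    """
    x, y = points[:, 0], points[:, 1]
    # Keep only points inside the BEV crop.
    mask = (x >= x_range[0]) & (x < x_range[1]) & \
           (y >= y_range[0]) & (y < y_range[1])
    x, y = x[mask], y[mask]
    # Grid dimensions implied by the crop extent and cell size.
    w = int((x_range[1] - x_range[0]) / cell_size)
    h = int((y_range[1] - y_range[0]) / cell_size)
    bev = np.zeros((h, w), dtype=np.float32)
    # Mark each cell that contains at least one point as occupied.
    col = ((x - x_range[0]) / cell_size).astype(int)
    row = ((y - y_range[0]) / cell_size).astype(int)
    bev[row, col] = 1.0
    return bev
```

In this scheme the network's per-cell outputs (object category and motion) align one-to-one with the input grid cells, which is what lets MotionNet phrase both perception and motion prediction as dense BEV map prediction.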
