End-to-end Contextual Perception and Prediction with Interaction Transformer

Paper Title

End-to-end Contextual Perception and Prediction with Interaction Transformer

Authors

Li, Lingyun Luke; Yang, Bin; Liang, Ming; Zeng, Wenyuan; Ren, Mengye; Segal, Sean; Urtasun, Raquel

Abstract

In this paper, we tackle the problem of detecting objects in 3D and forecasting their future motion in the context of self-driving. Towards this goal, we design a novel approach that explicitly takes into account the interactions between actors. To capture their spatial-temporal dependencies, we propose a recurrent neural network with a novel Transformer architecture, which we call the Interaction Transformer. Importantly, our model can be trained end-to-end, and runs in real-time. We validate our approach on two challenging real-world datasets: ATG4D and nuScenes. We show that our approach can outperform the state-of-the-art on both datasets. In particular, we significantly improve the social compliance between the estimated future trajectories, resulting in far fewer collisions between the predicted actors.
