Reinforced Axial Refinement Network for Monocular 3D Object Detection

Paper Title

Reinforced Axial Refinement Network for Monocular 3D Object Detection

Authors

Lijie Liu, Chufan Wu, Jiwen Lu, Lingxi Xie, Jie Zhou, Qi Tian

Abstract

Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image. This is an ill-posed problem, and a major difficulty lies in the information lost by depth-agnostic cameras. Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of drawing an effective sample is relatively small in 3D space. To improve sampling efficiency, we propose to start with an initial prediction and refine it gradually towards the ground truth, changing only one 3D parameter in each step. This requires designing a policy that gets a reward after several steps, and thus we adopt reinforcement learning to optimize it. The proposed framework, Reinforced Axial Refinement Network (RAR-Net), serves as a post-processing stage that can be freely integrated into existing monocular 3D detection methods and improves performance on the KITTI dataset with small extra computational cost.
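
The refinement loop described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the seven-parameter box layout, the helper names, the fixed step size, and above all the "oracle" policy, which peeks at the target box and only stands in for the learned reinforcement-learning policy. The sketch shows just the axial idea, in which each step changes exactly one 3D parameter.

```python
from typing import Callable, List, Tuple

# Hypothetical box layout (an assumption): centre (x, y, z), size (h, w, l), yaw.
PARAM_NAMES = ("x", "y", "z", "h", "w", "l", "yaw")
Box = List[float]
Action = Tuple[int, float]  # (index of the parameter to change, signed change)


def apply_action(box: Box, action: Action) -> Box:
    """Return a copy of the box with exactly one parameter changed."""
    index, delta = action
    refined = list(box)
    refined[index] += delta
    return refined


def l1_error(box: Box, target: Box) -> float:
    """Sum of absolute parameter differences (a simple stand-in metric)."""
    return sum(abs(b - t) for b, t in zip(box, target))


def make_oracle_policy(target: Box, step: float = 0.5) -> Callable[[Box], Action]:
    """Toy stand-in for a learned policy: it looks at the target box directly."""
    def policy(box: Box) -> Action:
        errors = [t - b for b, t in zip(box, target)]
        # Pick the axis with the largest remaining error.
        index = max(range(len(errors)), key=lambda i: abs(errors[i]))
        # Move along that axis by at most `step`, without overshooting.
        delta = max(-step, min(step, errors[index]))
        return index, delta
    return policy


def refine(box: Box, policy: Callable[[Box], Action], num_steps: int) -> Box:
    """Apply the policy repeatedly; every step is a single-parameter update."""
    for _ in range(num_steps):
        box = apply_action(box, policy(box))
    return box


# Made-up numbers, purely for demonstration.
initial = [1.0, 1.5, 20.0, 1.5, 1.6, 4.0, 0.0]
target = [2.0, 1.5, 22.0, 1.5, 1.6, 4.0, 0.25]
refined = refine(initial, make_oracle_policy(target), num_steps=10)
print(dict(zip(PARAM_NAMES, refined)), l1_error(refined, target))
```

In a real system the policy would not have access to the ground truth at inference time; it would be a trained model that chooses the axis and direction from image evidence, which is where the delayed reward and reinforcement learning mentioned in the abstract come in.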
