Neural Correspondence Field for Object Pose Estimation

Paper title

Neural Correspondence Field for Object Pose Estimation

Authors

Lin Huang, Tomas Hodan, Lingni Ma, Linguang Zhang, Luan Tran, Christopher Twigg, Po-Chen Wu, Junsong Yuan, Cem Keskin, Robert Wang

Abstract

We propose a method for estimating the 6DoF pose of a rigid object with an available 3D model from a single RGB image. Unlike classical correspondence-based methods, which predict 3D object coordinates at pixels of the input image, the proposed method predicts 3D object coordinates at 3D query points sampled in the camera frustum. The move from pixels to 3D points, which is inspired by recent PIFu-style methods for 3D reconstruction, enables reasoning about the whole object, including its (self-)occluded parts. For a 3D query point associated with a pixel-aligned image feature, we train a fully-connected neural network to predict: (i) the corresponding 3D object coordinates, and (ii) the signed distance to the object surface, with the first defined only for query points in the surface vicinity. We call the mapping realized by this network the Neural Correspondence Field. The object pose is then robustly estimated from the predicted 3D-3D correspondences by the Kabsch-RANSAC algorithm. The proposed method achieves state-of-the-art results on three BOP datasets and is shown to be superior, especially in challenging cases with occlusion. The project website is at: linhuang17.github.io/NCF.
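
The pose-fitting step named in the abstract, Kabsch inside a RANSAC loop over 3D-3D correspondences, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 3-point minimal sample, the iteration count, and the inlier threshold are all assumptions chosen for illustration. Here `obj_pts` stands for predicted 3D object coordinates (model space) and `cam_pts` for the matching 3D query points (camera space).

```python
import numpy as np


def kabsch(src, dst):
    """Least-squares rigid transform (R, t) with dst ~= src @ R.T + t."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def kabsch_ransac(obj_pts, cam_pts, n_iters=200, inlier_thresh=0.01, seed=0):
    """Robust pose from 3D-3D correspondences (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    n = len(obj_pts)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        # Three correspondences are the minimal sample for a rigid transform.
        idx = rng.choice(n, size=3, replace=False)
        R, t = kabsch(obj_pts[idx], cam_pts[idx])
        err = np.linalg.norm(obj_pts @ R.T + t - cam_pts, axis=1)
        inliers = err < inlier_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("RANSAC found no hypothesis with at least 3 inliers")
    # Refit on all inliers of the best hypothesis.
    R, t = kabsch(obj_pts[best_inliers], cam_pts[best_inliers])
    return R, t, best_inliers
```

The returned `R` and `t` map model-space points into the camera frame, and `best_inliers` marks the correspondences that agree with that pose. The threshold is expressed in the same units as the point coordinates, so it would need to be adapted to the scale of the data.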
