Category-Level 6D Object Pose and Size Estimation using Self-Supervised Deep Prior Deformation Networks

Paper Title

Category-Level 6D Object Pose and Size Estimation using Self-Supervised Deep Prior Deformation Networks

Authors

Jiehong Lin, Zewei Wei, Changxing Ding, Kui Jia

Abstract

Precisely annotating object instances and their semantics in 3D space is difficult, so synthetic data are used extensively for such tasks, e.g., category-level 6D object pose and size estimation. However, the ease of annotation in synthetic domains comes at the cost of a synthetic-to-real (Sim2Real) domain gap. In this work, we address this issue in the task setting of Sim2Real unsupervised domain adaptation for category-level 6D object pose and size estimation. We propose a method built upon a novel Deep Prior Deformation Network, abbreviated as DPDN. DPDN learns to deform features of categorical shape priors to match those of object observations, and is thus able to establish deep correspondence in the feature space for direct regression of object poses and sizes. To reduce the Sim2Real domain gap, we formulate a novel self-supervised objective upon DPDN via consistency learning. More specifically, we apply two rigid transformations to each object observation in parallel and feed the results into DPDN separately to yield dual sets of predictions. On top of this parallel learning, an inter-consistency term keeps the dual predictions consistent with each other to improve the sensitivity of DPDN to pose changes, while individual intra-consistency terms enforce self-adaptation within each learning branch. We train DPDN on the training sets of both the synthetic CAMERA25 and real-world REAL275 datasets; our results outperform existing methods on the REAL275 test set under both the unsupervised and supervised settings. Ablation studies also verify the efficacy of our designs. Our code is released publicly at https://github.com/JiehongLin/Self-DPDN.
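The inter-consistency idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each prediction is a rotation and translation pair, that the two rigid transformations are known, and that agreement is measured with a squared Frobenius norm plus a squared translation distance. All function names here are hypothetical, and object size and the intra-consistency terms are left out.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z-axis (used only to build toy poses)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def transform_pose(pose, delta):
    """Pose of the object after the observation is moved by delta = (dR, dt)."""
    (R, t), (dR, dt) = pose, delta
    return matmul(dR, R), [x + y for x, y in zip(matvec(dR, t), dt)]

def undo_transform(pred, delta):
    """Map a prediction made on the transformed input back to the original frame."""
    (R, t), (dR, dt) = pred, delta
    dR_inv = transpose(dR)  # the inverse of a rotation is its transpose
    return matmul(dR_inv, R), matvec(dR_inv, [x - y for x, y in zip(t, dt)])

def inter_consistency(pred_1, delta_1, pred_2, delta_2):
    """Assumed loss form: disagreement of the two predictions in the original frame."""
    R1, t1 = undo_transform(pred_1, delta_1)
    R2, t2 = undo_transform(pred_2, delta_2)
    rot_term = sum((R1[i][j] - R2[i][j]) ** 2 for i in range(3) for j in range(3))
    trans_term = sum((a - b) ** 2 for a, b in zip(t1, t2))
    return rot_term + trans_term

# Toy check: a predictor that tracks both transformations exactly is consistent.
true_pose = (rot_z(0.3), [0.1, -0.2, 0.8])
delta_1 = (rot_z(0.5), [0.02, 0.0, -0.01])
delta_2 = (rot_z(-0.4), [-0.03, 0.01, 0.0])
pred_1 = transform_pose(true_pose, delta_1)
pred_2 = transform_pose(true_pose, delta_2)
loss_consistent = inter_consistency(pred_1, delta_1, pred_2, delta_2)

# A prediction with an extra rotation error in one branch is penalized.
biased_2 = (matmul(rot_z(0.1), pred_2[0]), pred_2[1])
loss_inconsistent = inter_consistency(pred_1, delta_1, biased_2, delta_2)
```

In this toy setup the loss is near zero only when both predictions follow their respective input transformations. That is one way to read the abstract's claim that the term encourages sensitivity to pose changes.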
