Paper Title


Context-aware 6D Pose Estimation of Known Objects using RGB-D data

Paper Authors

Ankit Kumar, Priya Shukla, Vandana Kushwaha, G. C. Nandi

Paper Abstract


6D object pose estimation has long been a research topic in computer vision and robotics. Many real-world applications, such as robotic grasping, manipulation, and autonomous navigation, require the correct pose of the objects present in a scene in order to perform their specific tasks. The problem becomes even harder when objects are placed in a cluttered scene and the level of occlusion is high. Prior works have tried to overcome this problem but could not achieve accuracy reliable enough for real-world applications. In this paper, we present an architecture that, unlike prior work, is context-aware: it utilizes the contextual information available about the objects. Our proposed architecture treats objects differently according to their type, i.e., symmetric or non-symmetric. A deeper estimator and refiner network pair is used for non-symmetric objects than for symmetric ones, owing to their intrinsic differences. Our experiments show an accuracy improvement of about 3.2% over the prior state of the art, DenseFusion, on the LineMOD dataset, which is considered a benchmark for pose estimation in occluded and cluttered scenes. Our results also show that the achieved inference time is sufficient for real-time usage.
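To make the routing idea in the abstract concrete, below is a minimal PyTorch sketch of dispatching objects to a deeper estimator head when they are non-symmetric and a shallower one when they are symmetric. This is not the authors' released code; all module names, layer depths, and the example object IDs are illustrative assumptions.

```python
# Sketch (not the paper's implementation): route fused RGB-D features by
# object symmetry type, with a deeper head for non-symmetric objects.
import torch
import torch.nn as nn


def make_mlp(in_dim: int, hidden: int, depth: int, out_dim: int) -> nn.Sequential:
    """Stack `depth` hidden layers; the non-symmetric branch uses more."""
    layers: list[nn.Module] = []
    dim = in_dim
    for _ in range(depth):
        layers += [nn.Linear(dim, hidden), nn.ReLU()]
        dim = hidden
    layers.append(nn.Linear(dim, out_dim))
    return nn.Sequential(*layers)


class ContextAwarePoseHead(nn.Module):
    """Predict a 7-D pose (unit quaternion + translation) per object,
    choosing the estimator by the object's symmetry type."""

    def __init__(self, feat_dim: int = 1024, symmetric_ids=frozenset()):
        super().__init__()
        self.symmetric_ids = set(symmetric_ids)
        # Shallower estimator for symmetric objects, deeper for non-symmetric
        # (depths here are arbitrary choices for the sketch).
        self.sym_estimator = make_mlp(feat_dim, 256, depth=2, out_dim=7)
        self.nonsym_estimator = make_mlp(feat_dim, 512, depth=4, out_dim=7)

    def forward(self, feat: torch.Tensor, obj_id: int) -> torch.Tensor:
        head = (self.sym_estimator if obj_id in self.symmetric_ids
                else self.nonsym_estimator)
        pose = head(feat)
        # Normalize the quaternion part so it encodes a valid rotation.
        quat = nn.functional.normalize(pose[..., :4], dim=-1)
        return torch.cat([quat, pose[..., 4:]], dim=-1)


# Usage: features would come from a DenseFusion-style RGB-D fusion backbone;
# the symmetric IDs below are hypothetical (e.g., eggbox/glue in LineMOD).
model = ContextAwarePoseHead(feat_dim=1024, symmetric_ids={10, 11})
feat = torch.randn(1, 1024)
pose = model(feat, obj_id=10)  # shape (1, 7): quaternion + translation
```

An analogous shallow/deep split would apply to the refiner networks that iteratively correct the initial pose estimate.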
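The symmetric/non-symmetric distinction also shows up in how LineMOD accuracy is typically scored: ADD for non-symmetric objects and ADD-S (closest-point distance) for symmetric ones, since a symmetric object has many visually identical poses. The abstract does not spell out its metric, so the following is only a NumPy sketch of these standard definitions, not a reproduction of the paper's 3.2% result.

```python
# Standard ADD / ADD-S pose-error metrics (sketch, common LineMOD usage).
import numpy as np


def transform(points: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Apply a rigid transform (R, t) to an (N, 3) model point cloud."""
    return points @ R.T + t


def add_metric(points, R_pred, t_pred, R_gt, t_gt) -> float:
    """ADD: mean distance between corresponding transformed model points."""
    pred = transform(points, R_pred, t_pred)
    gt = transform(points, R_gt, t_gt)
    return float(np.linalg.norm(pred - gt, axis=1).mean())


def adds_metric(points, R_pred, t_pred, R_gt, t_gt) -> float:
    """ADD-S: mean closest-point distance; insensitive to object symmetry."""
    pred = transform(points, R_pred, t_pred)
    gt = transform(points, R_gt, t_gt)
    # For each ground-truth point, distance to its nearest predicted point.
    d = np.linalg.norm(gt[:, None, :] - pred[None, :, :], axis=2)
    return float(d.min(axis=1).mean())
```

A pose is then usually counted as correct when the metric falls below 10% of the object's diameter.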
