Learning visual policies for building 3D shape categories

Paper title

Learning visual policies for building 3D shape categories

Authors

Alexander Pashevich, Igor Kalevatykh, Ivan Laptev, Cordelia Schmid

Abstract

Manipulation and assembly tasks require non-trivial planning of actions depending on the environment and the final goal. Previous work in this domain often assembles particular instances of objects from known sets of primitives. In contrast, we aim to handle varying sets of primitives and to construct different objects of a shape category. Given a single object instance of a category, e.g. an arch, and a binary shape classifier, we learn a visual policy to assemble other instances of the same category. In particular, we propose a disassembly procedure and learn a state policy that discovers new object instances and their assembly plans in state space. We then render simulated states in the observation space and learn a heatmap representation to predict alternative actions from a given input image. To validate our approach, we first demonstrate its efficiency for building object categories in state space. We then show the success of our visual policies for building arches from different primitives. Moreover, we demonstrate (i) the reactive ability of our method to re-assemble objects using additional primitives and (ii) the robust performance of our policy for unseen primitives resembling building blocks used during training. Our visual assembly policies are trained with no real images and reach up to 95% success rate when evaluated on a real robot.
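
The abstract mentions a heatmap representation used to predict alternative actions from an input image. As a rough illustration only, and not the authors' implementation, the sketch below assumes a policy has already produced a 2D score map over image pixels and reads off the k highest-scoring locations as candidate actions. The function name `top_k_heatmap_actions`, the array sizes, and the scores are all hypothetical.

```python
import numpy as np


def top_k_heatmap_actions(heatmap, k=3):
    """Return the k highest-scoring pixels as (row, col, score) tuples.

    Assumption: each pixel score ranks one candidate action location.
    This decoding rule is illustrative and is not taken from the paper.
    """
    heatmap = np.asarray(heatmap, dtype=float)
    if heatmap.ndim != 2:
        raise ValueError("heatmap must be a 2D array")
    k = min(k, heatmap.size)
    flat = heatmap.ravel()
    # Flat indices of the k largest scores, ordered from highest to lowest.
    top = np.argsort(flat)[::-1][:k]
    rows, cols = np.unravel_index(top, heatmap.shape)
    return [(int(r), int(c), float(flat[i])) for r, c, i in zip(rows, cols, top)]


# Toy heatmap with three distinct peaks (hypothetical values).
heatmap = np.zeros((4, 5))
heatmap[1, 2] = 0.9
heatmap[3, 0] = 0.6
heatmap[0, 4] = 0.3

actions = top_k_heatmap_actions(heatmap, k=2)
print(actions)
```

Keeping several ranked candidates rather than a single argmax is one simple way to expose alternative actions; how the paper actually turns heatmaps into actions is not specified in the abstract.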
