Dynamic Inference on Graphs using Structured Transition Models

Paper Title

Dynamic Inference on Graphs using Structured Transition Models

Authors

Saumya Saxena, Oliver Kroemer

Abstract

Enabling robots to perform complex dynamic tasks such as picking up an object in one sweeping motion or pushing off a wall to quickly turn a corner is a challenging problem. The dynamic interactions implicit in these tasks are critical towards the successful execution of such tasks. Graph neural networks (GNNs) provide a principled way of learning the dynamics of interactive systems but can suffer from scaling issues as the number of interactions increases. Furthermore, the problem of using learned GNN-based models for optimal control is insufficiently explored. In this work, we present a method for efficiently learning the dynamics of interacting systems by simultaneously learning a dynamic graph structure and a stable and locally linear forward model of the system. The dynamic graph structure encodes evolving contact modes along a trajectory by making probabilistic predictions over the edges of the graph. Additionally, we introduce a temporal dependence in the learned graph structure which allows us to incorporate contact measurement updates during execution thus enabling more accurate forward predictions. The learned stable and locally linear dynamics enable the use of optimal control algorithms such as iLQR for long-horizon planning and control for complex interactive tasks. Through experiments in simulation and in the real world, we evaluate the performance of our method by using the learned interaction dynamics for control and demonstrate generalization to more objects and interactions not seen during training. We introduce a control scheme that takes advantage of contact measurement updates and hence is robust to prediction inaccuracies during execution.
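As a rough illustration of why a locally linear forward model is convenient for optimal control, the sketch below runs the finite-horizon LQR backward pass that iLQR solves repeatedly around a nominal trajectory. It is not the authors' implementation: the matrices `A` and `B` are a hypothetical double-integrator system standing in for a learned model, and the cost weights `Q` and `R`, the horizon, and the function names are assumptions chosen for the example.

```python
import numpy as np


def lqr_backward_pass(A, B, Q, R, horizon):
    """Time-varying LQR gains for x[t+1] = A x[t] + B u[t].

    Assumed cost: sum of x^T Q x + u^T R u, with terminal cost x^T Q x.
    """
    P = Q.copy()  # cost-to-go matrix at the final step
    gains = []
    for _ in range(horizon):
        # K = (R + B^T P B)^{-1} B^T P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati recursion for the cost-to-go one step earlier
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()  # gains[t] is the gain to apply at time t
    return gains


def rollout(A, B, gains, x0):
    """Apply u[t] = -K[t] x[t] and return the visited states."""
    x = x0.copy()
    states = [x]
    for K in gains:
        u = -K @ x
        x = A @ x + B @ u
        states.append(x)
    return np.array(states)


# Hypothetical stand-in for a learned model: a discretized double integrator.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)
R = 0.1 * np.eye(1)

gains = lqr_backward_pass(A, B, Q, R, horizon=100)
states = rollout(A, B, gains, np.array([1.0, 0.0]))
```

In iLQR the matrices would instead be re-linearized at every time step of the current trajectory, and in the setting described above they would come from the learned graph-structured model rather than being fixed by hand.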
