Language-Conditioned Imitation Learning for Robot Manipulation Tasks

Paper Title

Language-Conditioned Imitation Learning for Robot Manipulation Tasks

Authors

Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Stefan Lee, Chitta Baral, Heni Ben Amor

Abstract

Imitation learning is a popular approach for teaching motor skills to robots. However, most approaches focus on extracting policy parameters from execution traces alone (i.e., motion trajectories and perceptual data). No adequate communication channel exists between the human expert and the robot to describe critical aspects of the task, such as the properties of the target object or the intended shape of the motion. Motivated by insights into the human teaching process, we introduce a method for incorporating unstructured natural language into imitation learning. At training time, the expert can provide demonstrations along with verbal descriptions in order to describe the underlying intent (e.g., "go to the large green bowl"). The training process then interrelates these two modalities to encode the correlations between language, perception, and motion. The resulting language-conditioned visuomotor policies can be conditioned at runtime on new human commands and instructions, which allows for more fine-grained control over the trained policies while also reducing situational ambiguity. We demonstrate in a set of simulation experiments how our approach can learn language-conditioned manipulation policies for a seven-degree-of-freedom robot arm and compare the results to a variety of alternative methods.
