Discovering Unsupervised Behaviours from Full-State Trajectories

Paper title

Discovering Unsupervised Behaviours from Full-State Trajectories

Authors

Luca Grillotti, Antoine Cully

Abstract

Improving open-ended learning capabilities is a promising approach to enable robots to face the unbounded complexity of the real world. Among existing methods, the ability of Quality-Diversity algorithms to generate large collections of diverse and high-performing skills is instrumental in this context. However, most of those algorithms rely on a hand-coded behavioural descriptor to characterise the diversity, hence requiring prior knowledge about the considered tasks. In this work, we propose an additional analysis of Autonomous Robots Realising their Abilities, a Quality-Diversity algorithm that autonomously finds behavioural characterisations. We evaluate this approach on a simulated robotic environment, where the robot has to autonomously discover its abilities from its full-state trajectories. All algorithms were applied to three tasks: navigation, moving forward with a high velocity, and performing half-rolls. The experimental results show that the algorithm under study autonomously discovers collections of solutions that are diverse with respect to all tasks. More specifically, the analysed approach autonomously finds policies that make the robot move to diverse positions, but also utilise its legs in diverse ways, and even perform half-rolls.
