Demonstration-Bootstrapped Autonomous Practicing via Multi-Task Reinforcement Learning

Paper Title

Demonstration-Bootstrapped Autonomous Practicing via Multi-Task Reinforcement Learning

Authors

Abhishek Gupta, Corey Lynch, Brandon Kinman, Garrett Peake, Sergey Levine, Karol Hausman

Abstract

Reinforcement learning systems have the potential to enable continuous improvement in unstructured environments, leveraging data collected autonomously. However, in practice these systems require significant amounts of instrumentation or human intervention to learn in the real world. In this work, we propose a system for reinforcement learning that leverages multi-task reinforcement learning bootstrapped with prior data to enable continuous autonomous practicing, minimizing the number of resets needed while being able to learn temporally extended behaviors. We show how appropriately provided prior data can help bootstrap both low-level multi-task policies and strategies for sequencing these tasks one after another to enable learning with minimal resets. This mechanism enables our robotic system to practice with minimal human intervention at training time while being able to solve long horizon tasks at test time. We show the efficacy of the proposed system on a challenging kitchen manipulation task both in simulation and in the real world, demonstrating the ability to practice autonomously in order to solve temporally extended problems.
