Task Relabelling for Multi-task Transfer using Successor Features

Paper title

Task Relabelling for Multi-task Transfer using Successor Features

Authors

Martin Balla, Diego Perez-Liebana

Abstract

Deep reinforcement learning has recently been very successful across a range of complex domains. Most work focuses on learning a single policy that solves the target task but is fixed, in the sense that the agent cannot adapt if the environment changes. Successor Features (SFs) provide a mechanism for learning policies that are not tied to any particular reward function. In this work we investigate how SFs can be pre-trained without observing any reward in a custom environment that features resource collection, traps and crafting. After pre-training, we expose the SF agents to various target tasks and evaluate how well they transfer. Transfer requires no further training of the SF agents; it is done solely by providing a task vector. To train the SFs we propose a task relabelling method that greatly improves the agents' performance.
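
To illustrate how a task vector alone can change behaviour, the sketch below uses the standard successor-feature decomposition, in which the action value is the dot product of the successor features and a task vector, Q(s, a) = psi(s, a) . w. It is a minimal illustration and not the authors' implementation: the feature meanings, the numbers in `PSI`, and the names `q_values`, `greedy_action`, `collect` and `avoid_traps` are all assumptions made for this example.

```python
from typing import List, Sequence


def q_values(psi: Sequence[Sequence[float]], w: Sequence[float]) -> List[float]:
    """Return Q(s, a) = psi(s, a) . w for every action a in one state."""
    return [sum(p * wi for p, wi in zip(row, w)) for row in psi]


def greedy_action(psi: Sequence[Sequence[float]], w: Sequence[float]) -> int:
    """Pick the action with the highest Q-value under task vector w."""
    q = q_values(psi, w)
    return max(range(len(q)), key=q.__getitem__)


# Hypothetical successor features for a single state (one row per action).
# Assumed feature 0: expected discounted resources collected.
# Assumed feature 1: expected discounted trap encounters.
PSI = [
    [2.0, 1.0],  # action 0: many resources, but passes a trap
    [1.0, 0.0],  # action 1: fewer resources, no trap
    [0.0, 0.0],  # action 2: do nothing
]

# Two hypothetical task vectors; psi is not retrained between them.
collect = [1.0, 0.0]       # reward resources, ignore traps
avoid_traps = [1.0, -3.0]  # reward resources, penalise traps

print(greedy_action(PSI, collect))      # action chosen for the first task
print(greedy_action(PSI, avoid_traps))  # action chosen for the second task
```

The same fixed `PSI` yields different greedy actions under the two task vectors, which is the sense in which transfer needs only a new task vector and no further training.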
