Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language

Paper Title

Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language

Authors

Liu, Iou-Jen; Yuan, Xingdi; Côté, Marc-Alexandre; Oudeyer, Pierre-Yves; Schwing, Alexander G.

Abstract

To solve difficult tasks, humans ask questions to acquire knowledge from external sources. In contrast, classical reinforcement learning agents lack such an ability and often resort to exploratory behavior. This is exacerbated as few present-day environments support querying for knowledge. In order to study how agents can be taught to query external knowledge via language, we first introduce two new environments: the grid-world-based Q-BabyAI and the text-based Q-TextWorld. In addition to physical interactions, an agent can query an external knowledge source specialized for these environments to gather information. Second, we propose the "Asking for Knowledge" (AFK) agent, which learns to generate language commands to query for meaningful knowledge that helps solve the tasks. AFK leverages a non-parametric memory, a pointer mechanism and an episodic exploration bonus to tackle (1) irrelevant information, (2) a large query language space, (3) delayed reward for making meaningful queries. Extensive experiments demonstrate that the AFK agent outperforms recent baselines on the challenging Q-BabyAI and Q-TextWorld environments.
