Paper Title
Learning Relational Rules from Rewards
Paper Authors
Paper Abstract
Humans perceive the world in terms of objects and the relations between them. In fact, for any given pair of objects, there is a myriad of relations that apply to them. How does the cognitive system learn which relations are useful for characterizing the task at hand? And how can it use these representations to build a relational policy for interacting effectively with the environment? In this paper we propose that this problem can be understood through the lens of a sub-field of symbolic machine learning called relational reinforcement learning (RRL). To demonstrate the potential of our approach, we build a simple model of relational policy learning based on a function approximator developed in RRL. We trained and tested our model on three Atari games that require considering an increasing number of potential relations: Breakout, Pong, and Demon Attack. In each game, our model was able to select adequate relational representations and build a relational policy incrementally. We discuss the relationship between our model and models of relational and analogical reasoning, as well as its limitations and future research directions.
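To make the general idea concrete, the following is a minimal Python sketch of Q-learning over relational state descriptions, in the spirit of RRL. It is not the paper's model: the class name `RelationalQLearner`, the triple-based relation encoding, and the linear weighting over ground relations are all illustrative assumptions (classic RRL systems typically approximate Q with relational regression trees rather than a linear function).

```python
import random
from typing import Dict, FrozenSet, Tuple

# Hypothetical relational state: a set of ground relations such as
# ("left_of", "ball", "paddle") or ("above", "ball", "paddle").
RelationalState = FrozenSet[Tuple[str, str, str]]


class RelationalQLearner:
    """Illustrative relational Q-learner: Q(s, a) is approximated as the
    sum of per-relation weights over the ground relations present in the
    state, with one weight table per action. This linear approximator is
    a stand-in for the richer function approximators used in RRL."""

    def __init__(self, actions, alpha=0.1, gamma=0.99, epsilon=0.1):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # weights[action][relation] -> contribution of that relation to Q
        self.weights: Dict[str, Dict[Tuple[str, str, str], float]] = {
            a: {} for a in self.actions
        }

    def q_value(self, state: RelationalState, action: str) -> float:
        w = self.weights[action]
        return sum(w.get(rel, 0.0) for rel in state)

    def select_action(self, state: RelationalState) -> str:
        # Epsilon-greedy selection over the relational Q-values.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q_value(state, a))

    def update(self, state, action, reward, next_state):
        # One-step Q-learning update, with the TD error distributed
        # across the relations active in the current state.
        target = reward + self.gamma * max(
            self.q_value(next_state, a) for a in self.actions
        )
        td_error = target - self.q_value(state, action)
        w = self.weights[action]
        for rel in state:
            w[rel] = w.get(rel, 0.0) + self.alpha * td_error


# Usage sketch on one fictional Breakout-like transition.
agent = RelationalQLearner(actions=["left", "right", "stay"])
s = frozenset({("left_of", "ball", "paddle"), ("above", "ball", "paddle")})
a = agent.select_action(s)
s_next = frozenset({("aligned", "ball", "paddle")})
agent.update(s, a, reward=1.0, next_state=s_next)
```

Because weights attach to individual relations rather than raw pixels, an agent of this kind can in principle discover which of the many applicable relations carry reward signal, which is the representation-selection problem the abstract describes.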