Learning and Understanding a Disentangled Feature Representation for Hidden Parameters in Reinforcement Learning

Paper title

Learning and Understanding a Disentangled Feature Representation for Hidden Parameters in Reinforcement Learning

Authors

Christopher Reale, Rebecca Russell

Abstract

Hidden parameters are latent variables in reinforcement learning (RL) environments that are constant over the course of a trajectory. Understanding which hidden parameters, if any, affect a particular environment can aid both the development and the appropriate usage of RL systems. We present an unsupervised method to map RL trajectories into a feature space where distance represents the relative difference in system behavior due to hidden parameters. Our approach disentangles the effects of hidden parameters by leveraging a recurrent neural network (RNN) world model as used in model-based RL. First, we alter the standard world model training algorithm to isolate the hidden parameter information in the world model memory. Then, we use a metric learning approach to map the RNN memory into a space with a distance metric approximating a bisimulation metric with respect to the hidden parameters. The resulting disentangled feature space can be used to meaningfully relate trajectories to each other and analyze the hidden parameter. We demonstrate our approach on four hidden parameters across three RL environments. Finally, we present two methods to help identify and understand the effects of hidden parameters on systems.
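
The abstract does not give the loss used in the metric learning step, so the following is only a minimal sketch of the general idea: features are scored by how closely their pairwise distances match a target measure of behavioral difference. The function names, the squared-error objective, and the toy hidden parameter values are all assumptions made for illustration and are not taken from the paper; in particular, the toy target stands in for the bisimulation-style metric the authors approximate.

```python
import math


def pairwise_distances(features):
    """Euclidean distance between every pair of feature vectors."""
    return [[math.dist(a, b) for b in features] for a in features]


def metric_matching_loss(features, target):
    """Mean squared gap between feature-space distances and target distances.

    Hypothetical objective for illustration; not the paper's actual loss.
    """
    dists = pairwise_distances(features)
    n = len(features)
    total = sum(
        (dists[i][j] - target[i][j]) ** 2 for i in range(n) for j in range(n)
    )
    return total / (n * n)


# Toy setup (assumed): three trajectories, each generated under a different
# scalar hidden parameter, with behavioral difference taken to be the
# absolute difference between the hidden parameter values.
hidden_values = [0.5, 1.0, 2.0]
target = [[abs(a - b) for b in hidden_values] for a in hidden_values]

# A one-dimensional feature equal to the hidden parameter matches the target.
good_features = [[v] for v in hidden_values]
# Features stretched by a factor of two distort the distances.
stretched_features = [[2.0 * v] for v in hidden_values]

good_loss = metric_matching_loss(good_features, target)
stretched_loss = metric_matching_loss(stretched_features, target)
```

In this toy case the first feature set reproduces the target distances and the stretched one does not, which is the property a learned mapping from RNN memory would be trained toward under this kind of objective.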

