Symmetry Detection in Trajectory Data for More Meaningful Reinforcement Learning Representations

Paper Title

Symmetry Detection in Trajectory Data for More Meaningful Reinforcement Learning Representations

Authors

Marissa D'Alonzo, Rebecca Russell

Abstract

Knowledge of the symmetries of reinforcement learning (RL) systems can be used to create compressed and semantically meaningful representations of a low-level state space. We present a method of automatically detecting RL symmetries directly from raw trajectory data without requiring active control of the system. Our method generates candidate symmetries and trains a recurrent neural network (RNN) to discriminate between the original trajectories and the transformed trajectories for each candidate symmetry. The RNN discriminator's accuracy for each candidate reveals how symmetric the system is under that transformation. This information can be used to create high-level representations that are invariant to all symmetries on a dataset level and to communicate properties of the RL behavior to users. We show in experiments on two simulated RL use cases (a pusher robot and a UAV flying in wind) that our method can determine the symmetries underlying both the environment physics and the trained RL policy.
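
The discriminator idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a nearest-centroid classifier on a hand-picked feature (mean per-step displacement) stands in for the RNN, and the data are synthetic 2-D random walks with a constant drift along x as a rough stand-in for wind. All function names, parameters, and candidate transformations here are hypothetical.

```python
import numpy as np

def make_trajectories(rng, n=1000, steps=30, wind=1.0):
    """Toy 2-D random walks with a constant drift along x (assumed stand-in for wind)."""
    step = rng.normal(0.0, 1.0, size=(n, steps, 2))
    step[..., 0] += wind
    return np.cumsum(step, axis=1)  # shape (n, steps, 2)

def features(traj):
    """Mean per-step displacement of each trajectory, shape (n, 2)."""
    return np.diff(traj, axis=1).mean(axis=1)

def discriminator_accuracy(traj, transform, rng):
    """Held-out accuracy of a nearest-centroid classifier (swapped in for the RNN)
    at telling untouched trajectories from transformed ones."""
    half = len(traj) // 2
    original, transformed = traj[:half], transform(traj[half:])
    X = np.vstack([features(original), features(transformed)])
    y = np.concatenate([np.zeros(len(original)), np.ones(len(transformed))])
    idx = rng.permutation(len(X))
    train, test = idx[: len(X) // 2], idx[len(X) // 2 :]
    # Fit: one centroid per class on the training half.
    c0 = X[train][y[train] == 0].mean(axis=0)
    c1 = X[train][y[train] == 1].mean(axis=0)
    # Predict: assign each test sample to the nearer centroid.
    d0 = np.linalg.norm(X[test] - c0, axis=1)
    d1 = np.linalg.norm(X[test] - c1, axis=1)
    pred = (d1 < d0).astype(float)
    return float((pred == y[test]).mean())

# Candidate symmetries: negate the x coordinate or the y coordinate.
candidates = {
    "reflect_x": lambda t: t * np.array([-1.0, 1.0]),
    "reflect_y": lambda t: t * np.array([1.0, -1.0]),
}

rng = np.random.default_rng(0)
trajectories = make_trajectories(rng)
results = {name: discriminator_accuracy(trajectories, f, rng)
           for name, f in candidates.items()}
for name, acc in results.items():
    print(f"{name}: held-out accuracy {acc:.2f}")
```

In this toy setup, an accuracy close to 0.5 means the classifier cannot tell original from transformed trajectories, which suggests the data are symmetric under that candidate. An accuracy close to 1.0 suggests the candidate is not a symmetry. Here the drift along x should make `reflect_x` easy to detect, while `reflect_y` should stay near chance.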

Paper Title