Paper Title
Accelerating Representation Learning with View-Consistent Dynamics in Data-Efficient Reinforcement Learning
Authors
Abstract
Learning informative representations from image-based observations is a fundamental concern in deep reinforcement learning (RL). However, data inefficiency remains a significant barrier to this objective. To overcome this obstacle, we propose to accelerate state representation learning by enforcing view consistency on the dynamics. First, we introduce the formalism of the Multi-view Markov Decision Process (MMDP), which incorporates multiple views of the state. Following the structure of the MMDP, our method, View-Consistent Dynamics (VCD), learns state representations by training a view-consistent dynamics model in the latent space, where views are generated by applying data augmentation to states. Empirical evaluation on the DeepMind Control Suite and Atari-100k demonstrates that VCD is the state-of-the-art data-efficient algorithm on visual control tasks.
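The idea of a view-consistent latent dynamics model can be illustrated with a small sketch. The code below is not the authors' implementation: the abstract does not specify the loss, the networks, or the augmentation, so everything here is an assumption made for illustration. It assumes 1-D observations, a random-shift augmentation (`random_shift`), linear maps standing in for the encoder and the dynamics model (`linear`), a scalar action, and a cross-view mean-squared-error objective (`view_consistency_loss`) in which the next latent predicted from one view must match the encoding of a different view of the next state.

```python
import random


def random_shift(obs, max_shift, rng):
    """Hypothetical augmentation: pad a 1-D observation with its edge
    values, then crop a window of the original length at a random offset."""
    padded = [obs[0]] * max_shift + list(obs) + [obs[-1]] * max_shift
    start = rng.randint(0, 2 * max_shift)
    return padded[start:start + len(obs)]


def linear(weights, vec):
    """Matrix-vector product; a stand-in for a learned network."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def view_consistency_loss(enc_w, dyn_w, obs, action, next_obs, rng, max_shift=2):
    """Assumed objective: predict the next latent from one augmented view
    and compare it with the encoding of the other view of the next state."""
    # Two augmented views of the current state and of the next state.
    view_a = random_shift(obs, max_shift, rng)
    view_b = random_shift(obs, max_shift, rng)
    next_a = random_shift(next_obs, max_shift, rng)
    next_b = random_shift(next_obs, max_shift, rng)

    # Encode all four views with the shared encoder.
    z_a, z_b = linear(enc_w, view_a), linear(enc_w, view_b)
    target_a, target_b = linear(enc_w, next_a), linear(enc_w, next_b)

    # Latent dynamics: input is the latent concatenated with the scalar action.
    pred_a = linear(dyn_w, z_a + [action])
    pred_b = linear(dyn_w, z_b + [action])

    # Cross-view consistency: the prediction from view A is matched to the
    # encoding of next-state view B, and vice versa.
    return 0.5 * (mse(pred_a, target_b) + mse(pred_b, target_a))


if __name__ == "__main__":
    rng = random.Random(0)
    enc_w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 3-D observation -> 2-D latent
    dyn_w = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]  # (latent, action) -> next latent
    loss = view_consistency_loss(enc_w, dyn_w, [0.1, 0.2, 0.3], 1.0, [0.6, 0.7, 0.8], rng)
    print(f"view-consistency loss: {loss:.4f}")
```

In a real agent the linear maps would be replaced by trained networks and the loss would be minimized by gradient descent alongside the RL objective. Whether the targets are detached from the gradient, and how many views are used, are design choices this sketch leaves open.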