Accelerating Representation Learning with View-Consistent Dynamics in Data-Efficient Reinforcement Learning

Paper Title

Accelerating Representation Learning with View-Consistent Dynamics in Data-Efficient Reinforcement Learning

Authors

Huang, Tao; Wang, Jiachen; Chen, Xiao

Abstract

Learning informative representations from image-based observations is a fundamental concern in deep Reinforcement Learning (RL). However, data inefficiency remains a significant barrier to this objective. To overcome this obstacle, we propose to accelerate state representation learning by enforcing view-consistency on the dynamics. First, we introduce the formalism of a Multi-view Markov Decision Process (MMDP), which incorporates multiple views of the state. Following the structure of the MMDP, our method, View-Consistent Dynamics (VCD), learns state representations by training a view-consistent dynamics model in the latent space, where views are generated by applying data augmentation to states. Empirical evaluation on the DeepMind Control Suite and Atari-100k demonstrates that VCD is the state-of-the-art (SoTA) data-efficient algorithm on visual control tasks.
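The idea of enforcing view-consistency on latent dynamics can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the random-shift augmentation, the linear encoder and dynamics model, the mean-squared-error consistency term, and all names (`random_shift`, `encode`, `predict_next`, `view_consistency_loss`) are assumptions chosen for illustration, since the abstract does not specify any of them.

```python
import numpy as np

def random_shift(img, pad, rng):
    """Assumed augmentation: edge-pad an (H, W) image, then crop back at a random offset."""
    h, w = img.shape
    padded = np.pad(img, pad, mode="edge")
    dy, dx = rng.integers(0, 2 * pad + 1, size=2)
    return padded[dy:dy + h, dx:dx + w]

def encode(img, W_enc):
    """Hypothetical encoder: one linear layer followed by tanh."""
    return np.tanh(W_enc @ img.ravel())

def predict_next(z, action, W_dyn):
    """Hypothetical latent dynamics model: linear in the concatenated (latent, action)."""
    return W_dyn @ np.concatenate([z, action])

def view_consistency_loss(obs, action, W_enc, W_dyn, rng, pad=2):
    """Mean squared error between next-latent predictions made from two views of one state."""
    view_a = random_shift(obs, pad, rng)
    view_b = random_shift(obs, pad, rng)
    next_a = predict_next(encode(view_a, W_enc), action, W_dyn)
    next_b = predict_next(encode(view_b, W_enc), action, W_dyn)
    return float(np.mean((next_a - next_b) ** 2))

# Toy data: an 8x8 "image", a 2-D action, and a 4-D latent space.
rng = np.random.default_rng(0)
obs = rng.random((8, 8))
action = np.array([0.5, -0.5])
W_enc = rng.normal(scale=0.1, size=(4, 64))
W_dyn = rng.normal(scale=0.1, size=(4, 6))
loss = view_consistency_loss(obs, action, W_enc, W_dyn, rng)
```

A consistency term of this kind is minimized trivially by a constant encoder, so in practice it would have to be combined with other training signals such as the RL objective; the sketch omits those and shows only the consistency term.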
