Paper Title

Introspective Experience Replay: Look Back When Surprised

Authors

Ramnath Kumar, Dheeraj Nagaraj

Abstract

In reinforcement learning (RL), experience replay-based sampling techniques play a crucial role in promoting convergence by eliminating spurious correlations. However, widely used methods such as uniform experience replay (UER) and prioritized experience replay (PER) have been shown to have sub-optimal convergence and high seed sensitivity, respectively. To address these issues, we propose a novel approach called Introspective Experience Replay (IER) that selectively samples batches of data points prior to surprising events. Our method builds upon the theoretically sound reverse experience replay (RER) technique, which has been shown to reduce bias in the output of Q-learning-type algorithms with linear function approximation. However, this approach is not always practical or reliable when using neural function approximation. Through empirical evaluations, we demonstrate that IER with neural function approximation yields reliable and superior performance compared to UER, PER, and hindsight experience replay (HER) across most tasks.
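The sampling rule described in the abstract (find a surprising event, then replay the batch of transitions that immediately precedes it) can be sketched in a few lines. The Python below is a minimal illustration of that idea, not the authors' implementation: the class name `IntrospectiveReplayBuffer` and the `surprise_fn` argument are hypothetical, the surprise score is assumed to be something like the absolute TD error under the current Q-network, and the full method may select multiple surprising pivots per update rather than the single pivot used here.

```python
import numpy as np


class IntrospectiveReplayBuffer:
    """Minimal sketch of IER-style sampling ("look back when surprised").

    Hypothetical names; this is an illustration of the sampling rule
    described in the abstract, not the authors' code.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.transitions = []  # kept in insertion (temporal) order

    def add(self, transition):
        # Drop the oldest transition once the buffer is full.
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
        self.transitions.append(transition)

    def sample(self, batch_size, surprise_fn):
        """Return the batch of transitions preceding the most surprising one.

        `surprise_fn` maps a transition to a scalar surprise score,
        e.g. the magnitude of its TD error under the current network.
        """
        scores = np.array([surprise_fn(t) for t in self.transitions])
        pivot = int(np.argmax(scores))  # index of the most surprising event
        start = max(0, pivot - batch_size + 1)
        # Transitions prior to (and including) the surprise, in temporal
        # order, so they can also be replayed in reverse as in RER.
        return self.transitions[start:pivot + 1]
```

Because the returned batch is contiguous and temporally ordered, it can be fed to the learner backward from the surprising event, which is the connection to the reverse experience replay (RER) technique the method builds on.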
