DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems

Paper Title

DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems

Authors

Pierre Schumacher, Daniel Häufle, Dieter Büchler, Syn Schmitt, Georg Martius

Abstract

Muscle-actuated organisms are capable of learning an unparalleled diversity of dexterous movements despite their vast amount of muscles. Reinforcement learning (RL) on large musculoskeletal models, however, has not been able to show similar performance. We conjecture that ineffective exploration in large overactuated action spaces is a key problem. This is supported by the finding that common exploration noise strategies are inadequate in synthetic examples of overactuated systems. We identify differential extrinsic plasticity (DEP), a method from the domain of self-organization, as being able to induce state-space covering exploration within seconds of interaction. By integrating DEP into RL, we achieve fast learning of reaching and locomotion in musculoskeletal systems, outperforming current approaches in all considered tasks in sample efficiency and robustness.

