Paper Title
Model-Based Reinforcement Learning for Physical Systems Without Velocity and Acceleration Measurements
Paper Authors
Paper Abstract
In this paper, we propose a derivative-free model learning framework for Reinforcement Learning (RL) algorithms based on Gaussian Process Regression (GPR). In many mechanical systems, only positions can be measured by the sensing instruments. Thus, instead of representing the system state as suggested by the physics, with a collection of positions, velocities, and accelerations, we define the state as the set of past position measurements. However, the equations of motion derived from physical first principles cannot be directly applied in this framework, being functions of velocities and accelerations. For this reason, we introduce a novel derivative-free physically-inspired kernel, which can be easily combined with nonparametric derivative-free Gaussian Process models. Tests performed on two real platforms show that the considered state definition, combined with the proposed model, improves estimation performance and data efficiency w.r.t. traditional models based on GPR. Finally, we validate the proposed framework by solving two RL control problems on two real robotic systems.
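The derivative-free state definition described in the abstract can be illustrated with a minimal sketch: instead of building the regression input from positions, velocities, and accelerations, the GP input is a stack of the k most recent position measurements, and the model predicts the next position. The snippet below is an assumption-laden toy (a plain RBF kernel on a noise-free damped oscillator, not the paper's physically-inspired kernel), meant only to show the state construction and a basic GP fit; the function names and hyperparameters are hypothetical.

```python
import numpy as np

def history_states(q, k):
    """Derivative-free states: x_t = [q_t, q_{t-1}, ..., q_{t-k+1}],
    with target y_t = q_{t+1}. Only positions are used."""
    X = np.array([q[t - k + 1:t + 1][::-1] for t in range(k - 1, len(q) - 1)])
    y = np.array(q[k:])
    return X, y

def rbf(A, B, ls):
    """Squared-exponential kernel between row vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

# Toy position-only measurements from a damped oscillator.
t = np.linspace(0.0, 10.0, 200)
q = np.exp(-0.1 * t) * np.sin(2.0 * t)

X, y = history_states(q, k=3)

# Random train/test split (fixed seed for reproducibility).
rng = np.random.default_rng(0)
idx = rng.permutation(len(X))
tr, te = idx[:150], idx[150:]

# Standard GP regression: solve (K + sigma^2 I) alpha = y, then
# predict with the cross-kernel between test and training states.
ls = 0.5
K = rbf(X[tr], X[tr], ls) + 1e-6 * np.eye(len(tr))
alpha = np.linalg.solve(K, y[tr])
pred = rbf(X[te], X[tr], ls) @ alpha

rmse = np.sqrt(np.mean((pred - y[te]) ** 2))
```

Because the sampled oscillator satisfies an exact linear recurrence in past positions, even this generic kernel predicts the next position accurately; the paper's contribution is replacing the generic kernel with a derivative-free physically-inspired one to gain data efficiency.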