Paper title
Stochastic Linear Quadratic Optimal Control Problem: A Reinforcement Learning Method
Paper authors
Abstract
This paper applies a reinforcement learning (RL) method to infinite-horizon, continuous-time stochastic linear quadratic (LQ) problems in which both the drift and the diffusion terms of the dynamics may depend on the state and the control. Based on Bellman's dynamic programming principle, an online RL algorithm is proposed that attains the optimal control using only partial system information. The algorithm computes the optimal control directly, rather than first estimating the system coefficients and then solving the associated Riccati equation. It requires only local trajectory information, which greatly simplifies the computation. Two numerical examples illustrate the theoretical findings.
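For context, the model-based route that the abstract contrasts with can be sketched for a scalar system. This is not the paper's algorithm. It is a minimal illustration of standard policy iteration on the stochastic algebraic Riccati equation with fully known coefficients, under the assumed dynamics dX = (aX + bu) dt + (cX + du) dW and cost E∫(qX² + ru²) dt. The symbols a, b, c, d, q, r, the function names, and the numerical values are all illustrative assumptions and do not come from the paper.

```python
def policy_iteration_scalar_slq(a, b, c, d, q, r, k0=0.0, iters=50):
    """Model-based policy iteration for a scalar stochastic LQ problem.

    Assumed (hypothetical) setup, not taken from the paper:
        dX = (a*X + b*u) dt + (c*X + d*u) dW,   u = k*X,
        cost = E[ integral of (q*X^2 + r*u^2) dt over [0, inf) ].
    k0 must be mean-square stabilizing.
    """
    k = k0
    p = None
    for _ in range(iters):
        # Growth rate of E[X^2] under the feedback u = k*X.
        lam = 2.0 * (a + b * k) + (c + d * k) ** 2
        if lam >= 0.0:
            raise ValueError("gain is not mean-square stabilizing")
        # Policy evaluation: cost of gain k is p * x0^2.
        p = -(q + r * k * k) / lam
        # Policy improvement: minimize the Hamiltonian in k.
        k = -(b + c * d) * p / (r + d * d * p)
    return p, k


def sare_residual(p, a, b, c, d, q, r):
    """Residual of the scalar stochastic algebraic Riccati equation."""
    return (2.0 * a + c * c) * p + q - ((b + c * d) * p) ** 2 / (r + d * d * p)


if __name__ == "__main__":
    # Illustrative coefficients; k0 = 0 is stabilizing since 2a + c^2 < 0.
    coeffs = dict(a=-1.0, b=1.0, c=0.5, d=0.5, q=1.0, r=1.0)
    p, k = policy_iteration_scalar_slq(**coeffs)
    print("P =", p, "K =", k, "residual =", sare_residual(p, **coeffs))
```

Every step of this baseline uses the coefficients a, b, c, d explicitly, which is exactly the dependence that an approach based only on partial system information and observed trajectories aims to reduce.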