Paper Title


Improper Learning for Non-Stochastic Control

Authors

Simchowitz, Max, Singh, Karan, Hazan, Elad

Abstract


We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states, known as non-stochastic control. We introduce a controller parametrization based on the denoised observations, and prove that applying online gradient descent to this parametrization yields a new controller which attains sublinear regret vs. a large class of closed-loop policies. In the fully-adversarial setting, our controller attains an optimal regret bound of $\sqrt{T}$ when the system is known, and, when combined with an initial stage of least-squares estimation, $T^{2/3}$ when the system is unknown; both yield the first sublinear regret for the partially observed setting. Our bounds are the first in the non-stochastic control setting that compete with \emph{all} stabilizing linear dynamical controllers, not just state feedback. Moreover, in the presence of semi-adversarial noise containing both stochastic and adversarial components, our controller attains the optimal regret bounds of $\mathrm{poly}(\log T)$ when the system is known, and $\sqrt{T}$ when unknown. To our knowledge, this gives the first end-to-end $\sqrt{T}$ regret for the online Linear Quadratic Gaussian controller, and applies in a more general setting with adversarial losses and semi-adversarial noise.
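As a rough illustration of the online-gradient-descent component the abstract refers to, the sketch below runs projected OGD over a generic controller parameter against a sequence of loss gradients. This is only a minimal sketch under assumed names: `grad_fn`, the step size `eta`, and the norm-ball radius are hypothetical stand-ins, not the paper's actual disturbance-response parametrization or its tuned constants.

```python
import numpy as np

def projected_ogd(grad_fn, m0, eta, T, radius=10.0):
    """Projected online gradient descent.

    grad_fn(t, m): gradient of the (possibly adversarial) round-t loss at m.
    m0:            initial parameter vector (hypothetical controller params).
    eta:           fixed step size.
    radius:        norm ball onto which iterates are projected, keeping the
                   competing policy class bounded.
    Returns the list of iterates m_0, ..., m_T.
    """
    m = np.asarray(m0, dtype=float).copy()
    iterates = [m.copy()]
    for t in range(T):
        m = m - eta * grad_fn(t, m)          # gradient step on round-t loss
        norm = np.linalg.norm(m)
        if norm > radius:                    # Euclidean projection onto ball
            m *= radius / norm
        iterates.append(m.copy())
    return iterates
```

For example, with fixed quadratic losses $f_t(m) = \|m - \theta\|^2$ the iterates contract geometrically toward $\theta$; in the adversarial setting the guarantee is instead sublinear regret against the best fixed parameter in hindsight.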
