Paper Title

Stochastic gradient algorithms from ODE splitting perspective

Paper Authors

Daniil Merkulov, Ivan Oseledets

Paper Abstract

We present a different view on stochastic optimization, which goes back to splitting schemes for the approximate solution of ODEs. In this work, we provide a connection between the stochastic gradient descent approach and a first-order splitting scheme for ODEs. We consider a special case of splitting, inspired by machine learning applications, and derive a new upper bound on the global splitting error for it. We show that the Kaczmarz method is the limiting case of the splitting scheme for unit-batch SGD on the linear least squares problem. We support our findings with a systematic empirical study, which demonstrates that a more accurate solution of the local problems leads to step-size robustness and better convergence, in both time and iterations, on the softmax regression problem.
