Title
Convergence of Gradient Algorithms for Nonconvex C^{1+alpha} Cost Functions
Authors
Abstract
This paper is concerned with the convergence of stochastic gradient algorithms with momentum terms in the nonconvex setting. A class of stochastic momentum methods, including stochastic gradient descent, the heavy-ball method, and Nesterov's accelerated gradient, is analyzed in a general framework under mild assumptions. Building on a convergence result for expected gradients, we prove almost sure convergence through a detailed analysis of the effects of momentum and of the number of upcrossings. Notably, no additional restrictions are imposed on the objective function or the stepsize. A further improvement over previous results is that the usual Lipschitz condition on the gradient is relaxed to Hölder continuity. As a byproduct, we apply a localization procedure to extend our results to stochastic stepsizes.
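The class of methods named in the abstract is often written as a single update rule with a momentum parameter and a look-ahead parameter. Below is a minimal sketch of that standard unified form, not the paper's own analysis: the toy objective `f(x) = x^4/4 - x^2/2`, its noisy gradient oracle, and all parameter values are illustrative assumptions.

```python
import random

def noisy_grad(x):
    # Gradient of the toy nonconvex objective f(x) = x^4/4 - x^2/2,
    # i.e. x^3 - x, plus Gaussian noise standing in for the stochastic oracle.
    return x**3 - x + random.gauss(0.0, 0.01)

def momentum_sgd(x0, steps, alpha=0.01, beta=0.9, gamma=0.0, seed=0):
    """Unified stochastic momentum update (illustrative sketch):
        v_{k+1} = beta * v_k - alpha * g(x_k + gamma * v_k)
        x_{k+1} = x_k + v_{k+1}
    Special cases:
        beta = 0     -> plain stochastic gradient descent
        gamma = 0    -> heavy ball
        gamma = beta -> Nesterov's accelerated gradient
    """
    random.seed(seed)
    x, v = x0, 0.0
    for _ in range(steps):
        g = noisy_grad(x + gamma * v)  # gradient at the (possibly) look-ahead point
        v = beta * v - alpha * g       # momentum update
        x = x + v                      # parameter update
    return x

# f has minimizers at x = +1 and x = -1; starting from x0 = 2.0, each
# variant should settle near one of them despite the gradient noise.
sgd   = momentum_sgd(2.0, 5000, beta=0.0)             # SGD
hb    = momentum_sgd(2.0, 5000, beta=0.9, gamma=0.0)  # heavy ball
nag   = momentum_sgd(2.0, 5000, beta=0.9, gamma=0.9)  # Nesterov
```

The three special cases differ only in where the gradient is evaluated and how much of the previous increment is retained, which is why a single framework can cover them all.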