Title
Asymptotic Analysis of Conditioned Stochastic Gradient Descent
Authors
Abstract
In this paper, we investigate a general class of stochastic gradient descent (SGD) algorithms, called Conditioned SGD, based on a preconditioning of the gradient direction. Using a discrete-time approach with martingale tools, we establish, under mild assumptions, the weak convergence of the rescaled sequence of iterates for a broad class of conditioning matrices, including stochastic first-order and second-order methods. Almost sure convergence results, which may be of independent interest, are also presented. Interestingly, the asymptotic normality result rests on a stochastic equicontinuity property, so that when the conditioning matrix is an estimate of the inverse Hessian, the algorithm is asymptotically optimal.
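The abstract describes iterates updated by a preconditioned stochastic gradient step. A minimal sketch of that update rule, on a hypothetical toy quadratic problem (the function names, step-size schedule, and noise model below are illustrative assumptions, not the paper's exact setting):

```python
import numpy as np

def conditioned_sgd(grad, x0, cond_matrix, n_steps=5000, lr0=1.0, rng=None):
    """Conditioned SGD sketch: x_{k+1} = x_k - a_k * C_k @ g_k,
    where g_k is a stochastic gradient and C_k a conditioning matrix.
    The paper's assumptions on (a_k) and (C_k) are more general."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(x0, dtype=float)
    for k in range(1, n_steps + 1):
        a_k = lr0 / k              # classic Robbins-Monro step size (assumption)
        g_k = grad(x, rng)         # noisy gradient oracle
        C_k = cond_matrix(x, k)    # conditioning (preconditioning) matrix
        x = x - a_k * (C_k @ g_k)
    return x

# Toy problem: f(x) = 0.5 x^T A x, with additive Gaussian gradient noise.
A = np.array([[3.0, 0.0], [0.0, 1.0]])
noisy_grad = lambda x, rng: A @ x + 0.1 * rng.standard_normal(2)

# First-order method: identity conditioning (plain SGD).
x_sgd = conditioned_sgd(noisy_grad, [5.0, 5.0], lambda x, k: np.eye(2))
# Second-order method: C_k estimates the inverse Hessian (here exact, for illustration).
x_newton = conditioned_sgd(noisy_grad, [5.0, 5.0], lambda x, k: np.linalg.inv(A))
```

Both runs drive the iterates toward the minimizer at the origin; the abstract's optimality claim concerns the asymptotic covariance of the rescaled iterates when `C_k` estimates the inverse Hessian, which this sketch does not measure.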