Paper Title
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
Paper Authors
Paper Abstract
Constant step-size Stochastic Gradient Descent exhibits two phases: a transient phase, during which iterates make fast progress towards the optimum, followed by a stationary phase, during which iterates oscillate around the optimal point. In this paper, we show that efficiently detecting this transition and appropriately decreasing the step size can lead to fast convergence rates. We analyse the classical statistical test proposed by Pflug (1983), based on the inner product between consecutive stochastic gradients. Even in the simple case where the objective function is quadratic, we show that this test cannot lead to an adequate convergence diagnostic. We then propose a novel and simple statistical procedure that accurately detects stationarity, and we provide experimental results showing state-of-the-art performance on synthetic and real-world datasets.
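
For intuition, below is a minimal Python sketch of the kind of diagnostic-driven scheme the abstract describes: constant step-size SGD that accumulates Pflug's statistic (the inner product of consecutive stochastic gradients) and halves the step size once the statistic drifts negative. This is an illustrative sketch on a least-squares objective, not the authors' exact procedure; the function names, the burn-in length, and the halving factor are assumptions made for the example.

```python
import numpy as np

def stochastic_grad(w, X, y, i):
    """Gradient of the single-sample loss 0.5 * (x_i^T w - y_i)^2."""
    return (X[i] @ w - y[i]) * X[i]

def sgd_with_pflug_diagnostic(X, y, gamma=0.1, n_steps=10_000,
                              burn_in=100, seed=0):
    """Constant step-size SGD with a Pflug-style restart rule (sketch).

    Accumulates S = sum_t <g_t, g_{t-1}>. In the transient phase
    consecutive gradients tend to point in similar directions, so S
    grows; near the optimum the gradient noise dominates and S drifts
    negative, which we treat as a stationarity signal and halve gamma.
    The burn-in and the halving factor are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    g_prev = None
    S = 0.0          # running Pflug statistic
    t_since_restart = 0
    for _ in range(n_steps):
        i = rng.integers(n)
        g = stochastic_grad(w, X, y, i)
        if g_prev is not None:
            S += g @ g_prev          # inner product of consecutive gradients
        w -= gamma * g
        g_prev = g
        t_since_restart += 1
        if t_since_restart >= burn_in and S < 0:
            gamma /= 2.0             # stationarity detected: decrease step size
            S, g_prev, t_since_restart = 0.0, None, 0
    return w, gamma
```

As a usage example, `sgd_with_pflug_diagnostic(X, y)` on synthetic least-squares data (`X` an n-by-d design matrix, `y` the targets) returns the final iterate together with the last step size reached by the halving schedule.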