Paper Title

Reparameterizing Mirror Descent as Gradient Descent

Authors

Ehsan Amid, Manfred K. Warmuth

Abstract

Most of the recent successful applications of neural networks have been based on training with gradient descent updates. However, for some small networks, other mirror descent updates learn provably more efficiently when the target is sparse. We present a general framework for casting a mirror descent update as a gradient descent update on a different set of parameters. In some cases, the mirror descent reparameterization can be described as training a modified network with standard backpropagation. The reparameterization framework is versatile and covers a wide range of mirror descent updates, even cases where the domain is constrained. Our construction for the reparameterization argument is done for the continuous versions of the updates. Finding general criteria for the discrete versions to closely track their continuous counterparts remains an interesting open problem.
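
As an illustrative sketch of the reparameterization idea described above: a well-known instance of this correspondence is the unnormalized exponentiated gradient (EGU) update, whose continuous-time mirror flow dw/dt = -w * grad L(w) coincides with plain gradient flow on parameters u under the quadratic reparameterization w = u^2 / 4. The NumPy snippet below numerically checks, with a small step size, that discrete gradient descent on u closely tracks the discrete EGU update on w; the quadratic loss, data, step size, and iteration count are assumptions chosen for the sketch, not taken from the paper.

```python
import numpy as np

# Minimal numerical sketch of the reparameterization idea for the unnormalized
# exponentiated gradient (EGU) update. The loss, data, step size, and iteration
# count are illustrative assumptions, not taken from the paper.

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
w_star = np.array([1.0, 0.0, 0.5, 0.0, 0.0])   # sparse, nonnegative target
y = X @ w_star

def grad(w):
    """Gradient of the quadratic loss L(w) = 0.5 * ||X w - y||^2."""
    return X.T @ (X @ w - y)

eta = 1e-3                     # small step so both discrete updates track their continuous flows
w_egu = np.full(5, 0.1)        # EGU (mirror descent) iterate; remains positive
u = 2.0 * np.sqrt(w_egu)       # reparameterized iterate, chosen so that w = u**2 / 4 initially

for _ in range(5000):
    # EGU: multiplicative (mirror descent) update on w.
    w_egu = w_egu * np.exp(-eta * grad(w_egu))
    # Plain gradient descent on u for the composite loss L(u**2 / 4);
    # the chain rule gives the gradient (u / 2) * grad(u**2 / 4).
    u = u - eta * (u / 2.0) * grad(u**2 / 4.0)

print("EGU iterate:               ", np.round(w_egu, 4))
print("Reparameterized GD (u^2/4):", np.round(u**2 / 4.0, 4))
```

With a small step size the two printed iterates agree to several decimal places, matching the abstract's caveat that the equivalence is established for the continuous flows, while the behavior of the discrete updates is only tracked approximately.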
