Paper Title
A Relaxation Argument for Optimization in Neural Networks and Non-Convex Compressed Sensing
Paper Authors
Paper Abstract
It has been observed in practical applications and in theoretical analysis that over-parametrization helps to find good minima in neural network training. Similarly, in this article we study widening and deepening neural networks by a relaxation argument, so that the enlarged networks are rich enough to run $r$ copies of parts of the original network in parallel, without necessarily achieving zero training error as in over-parametrized scenarios. The partial copies can be combined in $r^\theta$ possible ways for layer width $\theta$. The enlarged networks can therefore potentially achieve the best training error of $r^\theta$ random initializations, but it is not immediately clear whether this can be realized via gradient descent or similar training methods. The same construction can be applied to other optimization problems by introducing a similar layered structure. We apply this idea to non-convex compressed sensing, where we show that in some scenarios we can realize the $r^\theta$ times increased chance of obtaining a global optimum by solving a convex optimization problem of dimension $r\theta$.
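A minimal sketch of the counting behind the $r^\theta$ factor, under the assumption that each of the $\theta$ units in an enlarged layer independently selects one of the $r$ parallel partial copies (the paper's precise construction may differ):
$$\underbrace{r \times r \times \cdots \times r}_{\theta\ \text{independent choices}} = r^{\theta},$$
which is why the enlarged network can in principle match the best training error among $r^\theta$ effective random initializations of the original network.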