Paper Title

The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers

Paper Authors

Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi

Paper Abstract

We propose a new framework for reasoning about generalization in deep learning. The core idea is to couple the Real World, where optimizers take stochastic gradient steps on the empirical loss, to an Ideal World, where optimizers take steps on the population loss. This leads to an alternate decomposition of test error into: (1) the Ideal World test error plus (2) the gap between the two worlds. If the gap (2) is universally small, this reduces the problem of generalization in offline learning to the problem of optimization in online learning. We then give empirical evidence that this gap between worlds can be small in realistic deep learning settings, in particular supervised image classification. For example, CNNs generalize better than MLPs on image distributions in the Real World, but this is "because" they optimize faster on the population loss in the Ideal World. This suggests our framework is a useful tool for understanding generalization in deep learning, and lays a foundation for future research in the area.
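To make the coupling concrete, below is a minimal toy sketch of the two worlds described in the abstract: both learners start from identical initialization and take the same number of SGD steps, but the Real World learner reuses a fixed finite sample (empirical loss) while the Ideal World learner draws fresh samples at every step (stochastic steps on the population loss). The synthetic task, architecture, and hyperparameters here are illustrative assumptions, not the paper's experimental setup.

```python
import torch
import torch.nn as nn

# Toy "population" we can sample from at will; a synthetic noisy-threshold
# task stands in for the true data distribution.
def sample_population(n):
    x = torch.randn(n, 20)
    y = (x[:, :1] + 0.1 * torch.randn(n, 1) > 0).float()
    return x, y

def make_model():
    torch.manual_seed(0)  # identical initialization couples the two worlds
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))

loss_fn = nn.BCEWithLogitsLoss()

# Real World: a fixed finite training set, reused at every step.
x_train, y_train = sample_population(1000)

real, ideal = make_model(), make_model()
opt_real = torch.optim.SGD(real.parameters(), lr=0.1)
opt_ideal = torch.optim.SGD(ideal.parameters(), lr=0.1)

for step in range(2000):
    # Real World step: minibatch from the same finite sample (empirical loss).
    idx = torch.randint(0, len(x_train), (32,))
    opt_real.zero_grad()
    loss_fn(real(x_train[idx]), y_train[idx]).backward()
    opt_real.step()

    # Ideal World step: fresh samples every step (population loss).
    xb, yb = sample_population(32)
    opt_ideal.zero_grad()
    loss_fn(ideal(xb), yb).backward()
    opt_ideal.step()

# Evaluate both worlds on held-out data from the population.
x_test, y_test = sample_population(10000)
with torch.no_grad():
    err = lambda m: ((m(x_test) > 0).float() != y_test).float().mean().item()
    print(f"Real World test error:  {err(real):.3f}")
    print(f"Ideal World test error: {err(ideal):.3f}")
    print(f"Bootstrap gap:          {err(real) - err(ideal):+.3f}")
```

The printed difference is exactly term (2) of the decomposition: Real World test error equals (1) Ideal World test error plus (2) the gap between the worlds. The framework's empirical claim is that in realistic deep learning settings this gap stays small along the training trajectory, so generalization in the Real World is governed by how fast the model optimizes in the Ideal World.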
