Paper Title

Limitations of Neural Collapse for Understanding Generalization in Deep Learning

Paper Authors

Like Hui, Mikhail Belkin, Preetum Nakkiran

Paper Abstract

The recent work of Papyan, Han, & Donoho (2020) presented an intriguing "Neural Collapse" phenomenon, showing a structural property of interpolating classifiers in the late stage of training. This opened a rich area of exploration studying this phenomenon. Our motivation is to study the upper limits of this research program: How far will understanding Neural Collapse take us in understanding deep learning? First, we investigate its role in generalization. We refine the Neural Collapse conjecture into two separate conjectures: collapse on the train set (an optimization property) and collapse on the test distribution (a generalization property). We find that while Neural Collapse often occurs on the train set, it does not occur on the test set. We thus conclude that Neural Collapse is primarily an optimization phenomenon, with as-yet-unclear connections to generalization. Second, we investigate the role of Neural Collapse in feature learning. We show simple, realistic experiments where training longer leads to worse last-layer features, as measured by transfer performance on a downstream task. This suggests that Neural Collapse is not always desirable for representation learning, as previously claimed. Finally, we give preliminary evidence of a "cascading collapse" phenomenon, wherein some form of Neural Collapse occurs not only for the last layer, but in earlier layers as well. We hope our work encourages the community to continue the rich line of Neural Collapse research, while also considering its inherent limitations.
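To make the train/test refinement of the conjecture concrete, within-class variability collapse can be measured separately on train-set and test-set features. The sketch below estimates one commonly used NC1-style metric, tr(Sigma_W pinv(Sigma_B)) / K, in the spirit of Papyan, Han, & Donoho (2020). It is an illustrative reconstruction rather than the authors' code, and the feature/label arrays referenced in the comments at the end are hypothetical placeholders.

```python
import numpy as np

def nc1_variability(features: np.ndarray, labels: np.ndarray) -> float:
    """Within-class variability collapse metric (NC1):
    tr(Sigma_W @ pinv(Sigma_B)) / K, where Sigma_W and Sigma_B are the
    within-class and between-class covariance matrices of the features.
    Smaller values indicate stronger collapse of features onto class means.

    features: (n, d) array of last-layer (penultimate) activations
    labels:   (n,) array of integer class labels
    """
    classes = np.unique(labels)
    n, d = features.shape
    K = len(classes)
    global_mean = features.mean(axis=0)

    sigma_w = np.zeros((d, d))  # within-class covariance
    sigma_b = np.zeros((d, d))  # between-class covariance
    for c in classes:
        fc = features[labels == c]
        mu_c = fc.mean(axis=0)
        centered = fc - mu_c
        sigma_w += centered.T @ centered / n
        diff = (mu_c - global_mean)[:, None]
        sigma_b += diff @ diff.T / K

    return float(np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / K)

# Hypothetical usage, mirroring the paper's split of the conjecture:
# nc1_train = nc1_variability(train_feats, train_labels)
# nc1_test  = nc1_variability(test_feats, test_labels)
```

Under the paper's refinement, one would expect this quantity to approach zero on train-set features during the terminal phase of training (the optimization property) while remaining bounded away from zero on test-set features (the generalization property that the paper finds does not hold).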
