Paper Title
Deep Ensembles Work, But Are They Necessary?
Paper Authors
Paper Abstract
Ensembling neural networks is an effective way to increase accuracy, and can often match the performance of individual larger models. This observation poses a natural question: given the choice between a deep ensemble and a single neural network with similar accuracy, is one preferable over the other? Recent work suggests that deep ensembles may offer distinct benefits beyond predictive power: namely, uncertainty quantification and robustness to dataset shift. In this work, we demonstrate limitations to these purported benefits, and show that a single (but larger) neural network can replicate these qualities. First, we show that ensemble diversity, by any metric, does not meaningfully contribute to an ensemble's uncertainty quantification on out-of-distribution (OOD) data, but is instead highly correlated with the relative improvement of a single larger model. Second, we show that the OOD performance afforded by ensembles is strongly determined by their in-distribution (InD) performance, and -- in this sense -- is not indicative of any "effective robustness". While deep ensembles are a practical way to achieve improvements to predictive power, uncertainty quantification, and robustness, our results show that these improvements can be replicated by a (larger) single model.
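The abstract contrasts a deep ensemble with a single larger model of comparable capacity. As a concrete illustration only, and not the authors' code, the sketch below assumes PyTorch and shows the standard construction: several independently initialized (and, in practice, independently trained) small networks whose softmax outputs are averaged, next to a single widened network with a roughly matched parameter count. The architectures, widths, and synthetic batch are placeholder assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

def make_mlp(width, in_dim=32, n_classes=10):
    # Placeholder architecture; any backbone works the same way.
    return nn.Sequential(
        nn.Linear(in_dim, width), nn.ReLU(),
        nn.Linear(width, width), nn.ReLU(),
        nn.Linear(width, n_classes),
    )

# Deep ensemble: K independently initialized copies of the same
# small architecture (each would be trained separately in practice).
K = 4
ensemble = [make_mlp(width=64) for _ in range(K)]

# Single larger model: widening the hidden layers gives a roughly
# comparable parameter budget (a common matching heuristic; exact
# matching would tune width or depth).
single_large = make_mlp(width=128)

x = torch.randn(8, 32)  # stand-in for an InD or OOD batch

with torch.no_grad():
    # Ensemble prediction: average the members' softmax probabilities.
    probs = torch.stack([F.softmax(m(x), dim=-1) for m in ensemble]).mean(0)
    probs_single = F.softmax(single_large(x), dim=-1)

# Predictive entropy: the uncertainty measure typically compared
# between ensembles and single models on out-of-distribution inputs.
entropy = -(probs * probs.log()).sum(-1)
entropy_single = -(probs_single * probs_single.log()).sum(-1)
print(entropy.mean().item(), entropy_single.mean().item())

Under the paper's framing, one would train both models, then ask whether the ensemble's OOD accuracy or uncertainty exceeds what its InD performance alone predicts; the abstract's claim is that it does not, and that the single larger model replicates the same improvements.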