Paper Title
Post-hoc Calibration of Neural Networks by g-Layers
Paper Authors
Paper Abstract
Calibration of neural networks is a critical aspect to consider when incorporating machine learning models into real-world decision-making systems, where the confidence of a decision is as important as the decision itself. In recent years, there has been a surge of research on neural network calibration, and the majority of the works can be categorized as post-hoc calibration methods, defined as methods that learn an additional function to calibrate an already trained base network. In this work, we aim to understand post-hoc calibration methods from a theoretical point of view. In particular, it is known that minimizing Negative Log-Likelihood (NLL) leads to a calibrated network on the training set if the global optimum is attained (Bishop, 1994). Nevertheless, it is not clear whether learning an additional function in a post-hoc manner leads to calibration in the theoretical sense. To this end, we prove that, even though the base network ($f$) does not reach the global optimum of NLL, by adding additional layers ($g$) and minimizing NLL with respect to the parameters of $g$ alone, one can obtain a calibrated network $g \circ f$. This not only provides a less stringent condition for obtaining a calibrated network but also gives a theoretical justification for post-hoc calibration methods. Our experiments on various image classification benchmarks confirm the theory.
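As an illustrative sketch of the post-hoc setting described above (not the paper's own $g$-layer construction), a minimal instance of $g$ is temperature scaling: the base network $f$ is frozen, and $g$ divides its logits by a single learned scalar $T$, fitted by minimizing NLL on held-out data. The function names and the numerical-gradient fitting loop below are hypothetical choices for the sketch, assuming NumPy only.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels):
    """Mean negative log-likelihood of the true labels under softmax(logits)."""
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, lr=0.05, steps=1000):
    """Learn a scalar T so that g(z) = z / T minimizes NLL.

    The base network's logits are fixed; only T (the parameters of g)
    is optimized, mirroring the post-hoc setup. A central-difference
    numerical gradient keeps the sketch dependency-free.
    """
    T = 1.0
    eps = 1e-4
    for _ in range(steps):
        grad = (nll(logits / (T + eps), labels)
                - nll(logits / (T - eps), labels)) / (2 * eps)
        T -= lr * grad
    return T
```

On an overconfident base network (confidence above accuracy) the fitted $T$ exceeds 1, softening the softmax; the same NLL-on-frozen-logits recipe extends to richer choices of $g$ with more parameters.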