Paper Title
The Conditional Entropy Bottleneck
Paper Authors
Paper Abstract
Much of the field of Machine Learning exhibits a prominent set of failure modes, including vulnerability to adversarial examples, poor out-of-distribution (OoD) detection, miscalibration, and willingness to memorize random labelings of datasets. We characterize these as failures of robust generalization, which extends the traditional measure of generalization as accuracy or related metrics on a held-out set. We hypothesize that these failures to robustly generalize are due to the learning systems retaining too much information about the training data. To test this hypothesis, we propose the Minimum Necessary Information (MNI) criterion for evaluating the quality of a model. In order to train models that perform well with respect to the MNI criterion, we present a new objective function, the Conditional Entropy Bottleneck (CEB), which is closely related to the Information Bottleneck (IB). We experimentally test our hypothesis by comparing the performance of CEB models with deterministic models and Variational Information Bottleneck (VIB) models on a variety of different datasets and robustness challenges. We find strong empirical evidence supporting our hypothesis that MNI models improve on these problems of robust generalization.
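As a brief illustrative sketch (not quoted from the abstract above): the MNI criterion and a CEB-style objective can be summarized in a few equations. The trade-off weight \(\gamma\), the forward encoder \(e(z\mid x)\), the backward encoder \(b(z\mid y)\), and the classifier \(c(y\mid z)\) are notational assumptions introduced here for illustration.

\[
\textbf{MNI criterion:}\qquad I(X;Z) \;=\; I(Y;Z) \;=\; I(X;Y)
\]
\[
\textbf{CEB-style objective:}\qquad \min_{Z}\; I(X;Z\mid Y) \;-\; \gamma\, I(Y;Z)
\]
\[
I(X;Z\mid Y) \;\le\; \mathbb{E}_{x,y,\,z\sim e(z\mid x)}\!\left[\log e(z\mid x) - \log b(z\mid y)\right],
\qquad
I(Y;Z) \;\ge\; H(Y) + \mathbb{E}\!\left[\log c(y\mid z)\right]
\]

Minimizing the resulting variational upper bound, \(\mathbb{E}\!\left[\log e(z\mid x) - \log b(z\mid y) - \gamma \log c(y\mid z)\right]\) (up to the constant \(H(Y)\)), pushes the learned representation \(Z\) toward the MNI point. Under the Markov chain \(Z \leftarrow X \leftrightarrow Y\), the identity \(I(X;Z) = I(X;Z\mid Y) + I(Y;Z)\) holds, which is one way to see the close relationship to the IB objective \(\min I(X;Z) - \beta\, I(Y;Z)\) mentioned in the abstract.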