Paper title
On Information Plane Analyses of Neural Network Classifiers -- A Review
Paper authors
Abstract
We review the current literature concerned with information plane analyses of neural network classifiers. While the underlying information bottleneck theory and the claim that information-theoretic compression is causally linked to generalization are plausible, the empirical evidence is mixed, with some studies supporting the claim and others conflicting with it. We review this evidence together with a detailed analysis of how the respective information quantities were estimated. Our survey suggests that the compression visualized in information planes is not necessarily information-theoretic, but is often compatible with geometric compression of the latent representations. This insight gives the information plane a renewed justification. Aside from this, we shed light on the problem of estimating mutual information in deterministic neural networks and its consequences. Specifically, we argue that even in feed-forward neural networks the data processing inequality need not hold for estimates of mutual information. Similarly, while a fitting phase, in which the mutual information between the latent representation and the target increases, is necessary (but not sufficient) for good classification performance, depending on the specifics of mutual information estimation such a fitting phase need not be visible in the information plane.