Paper Title
CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information
Paper Authors
Paper Abstract
Mutual information (MI) minimization has gained considerable interest in various machine learning tasks. However, estimating and minimizing MI in high-dimensional spaces remains a challenging problem, especially when only samples, rather than distribution forms, are accessible. Previous works mainly focus on MI lower-bound approximation, which is not applicable to MI minimization problems. In this paper, we propose a novel Contrastive Log-ratio Upper Bound (CLUB) of mutual information. We provide a theoretical analysis of the properties of CLUB and its variational approximation. Based on this upper bound, we introduce an MI minimization training scheme and further accelerate it with a negative sampling strategy. Simulation studies on Gaussian distributions show the reliable estimation ability of CLUB. Real-world MI minimization experiments, including domain adaptation and information bottleneck, demonstrate the effectiveness of the proposed method. The code is available at https://github.com/Linear95/CLUB.
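The contrastive log-ratio estimator described above contrasts the conditional log-likelihood of paired (positive) samples against that of unpaired (negative) samples. The sketch below is a minimal NumPy illustration of a sampled estimate of this form, assuming a Gaussian variational approximation q(y|x) with known mean and log-variance functions; the function names (`club_estimate`, `mu_fn`, `logvar_fn`) are illustrative and not taken from the released code.

```python
import numpy as np

def log_gaussian(y, mu, logvar):
    # log N(y; mu, diag(exp(logvar))), summed over the last (feature) axis
    return -0.5 * (((y - mu) ** 2) / np.exp(logvar)
                   + logvar + np.log(2 * np.pi)).sum(-1)

def club_estimate(x, y, mu_fn, logvar_fn):
    """Sampled contrastive log-ratio upper-bound estimate (sketch):
    (1/N) sum_i log q(y_i|x_i)  -  (1/N^2) sum_{i,j} log q(y_j|x_i).
    The first term scores paired samples; the second scores all pairs,
    serving as the negative (unpaired) contrast."""
    mu, logvar = mu_fn(x), logvar_fn(x)              # each of shape (N, d)
    positive = log_gaussian(y, mu, logvar).mean()    # paired samples
    # Broadcast every mu_i against every y_j to score all N*N pairs.
    all_pairs = log_gaussian(y[None, :, :],
                             mu[:, None, :],
                             logvar[:, None, :])     # shape (N, N)
    negative = all_pairs.mean()
    return positive - negative
```

In practice `mu_fn` and `logvar_fn` would be a trained variational network; here, for data generated as y = x + noise, plugging in the true conditional (identity mean, fixed noise variance) yields a positive estimate, consistent with an upper bound on a non-negative MI.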