Paper Title
PMR: Prototypical Modal Rebalance for Multimodal Learning
Paper Authors
Paper Abstract
Multimodal learning (MML) aims to jointly exploit the common priors of different modalities to compensate for their inherent limitations. However, existing MML methods often optimize a uniform objective across modalities, leading to the notorious "modality imbalance" problem and counterproductive MML performance. To address this problem, some existing methods modulate the learning pace based on the fused modality, which is dominated by the better modality and thus yields only limited improvement for the worse modality. To better exploit the features of each modality, we propose Prototypical Modal Rebalance (PMR) to stimulate the particular slow-learning modality without interference from the others. Specifically, we introduce prototypes, which represent the general features of each class, to build non-parametric classifiers for uni-modal performance evaluation. We then accelerate the slow-learning modality by enhancing the clustering of its features toward the prototypes. Furthermore, to alleviate suppression from the dominant modality, we introduce a prototype-based entropy regularization term during the early training stage to prevent premature convergence. Finally, our method relies only on the representations of each modality, without restrictions from model structures or fusion methods, giving it great application potential for various scenarios.
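To make the abstract's two mechanisms concrete, below is a minimal PyTorch sketch (not the authors' code) of a prototype-based non-parametric classifier, a clustering loss that pulls a slow-learning modality's features toward their class prototypes, and a prototype-based entropy term. The cosine-similarity logits, temperature `tau`, EMA prototype update, and all function names are assumptions for illustration; the paper's exact distance metric and update rule may differ.

```python
import torch
import torch.nn.functional as F

def proto_logits(features, prototypes, tau=1.0):
    """Non-parametric classifier: cosine similarity between L2-normalized
    uni-modal features [B, D] and class prototypes [C, D], scaled by a
    temperature tau. Returns logits of shape [B, C]."""
    f = F.normalize(features, dim=-1)
    p = F.normalize(prototypes, dim=-1)
    return f @ p.t() / tau

def clustering_loss(features, prototypes, labels, tau=1.0):
    """Pull each feature toward its class prototype by treating prototype
    similarities as logits (standard cross-entropy). Applying this only to
    the slow-learning modality stimulates it without touching the others."""
    return F.cross_entropy(proto_logits(features, prototypes, tau), labels)

def entropy_reg(features, prototypes, tau=1.0):
    """Prototype-based entropy of the soft prototype assignments.
    Subtracting this term from the loss (i.e., maximizing entropy) for the
    dominant modality keeps its assignments soft early in training and
    discourages premature convergence."""
    probs = F.softmax(proto_logits(features, prototypes, tau), dim=-1)
    return -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()

@torch.no_grad()
def update_prototypes(prototypes, features, labels, momentum=0.9):
    """One plausible prototype maintenance scheme (an assumption): an
    exponential moving average of the per-class mean features."""
    for c in labels.unique():
        mean_c = features[labels == c].mean(dim=0)
        prototypes[c] = momentum * prototypes[c] + (1 - momentum) * mean_c
    return prototypes
```

In a training loop, one might add `clustering_loss` on the modality whose prototype-classifier accuracy lags and subtract a weighted `entropy_reg` on the dominant modality during early epochs, alongside the usual fused-modality objective; since everything above consumes only per-modality representations, the sketch is agnostic to the backbone and fusion method, matching the abstract's claim.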