Paper Title

Learning more expressive joint distributions in multimodal variational methods

Paper Authors

Nedelkoski, Sasho; Bogojeski, Mihail; Kao, Odej

Paper Abstract

Data are often formed of multiple modalities, which jointly describe the observed phenomena. Modeling the joint distribution of multimodal data requires larger expressive power to capture high-level concepts and provide better data representations. However, multimodal generative models based on variational inference are limited by the lack of flexibility of the approximate posterior, which is obtained by searching within a known parametric family of distributions. We introduce a method that improves the representational capacity of multimodal variational methods using normalizing flows. It approximates the joint posterior with a simple parametric distribution and subsequently transforms it into a more complex one. Through several experiments, we demonstrate that the model improves on state-of-the-art multimodal methods based on variational inference on various computer vision tasks such as colorization, edge and mask detection, and weakly supervised learning. We also show that learning more powerful approximate joint distributions improves the quality of the generated samples. The code of our model is publicly available at https://github.com/SashoNedelkoski/BPFDMVM.
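
The core idea stated in the abstract, approximating the joint posterior with a simple parametric distribution and then transforming it into a more expressive one with normalizing flows, can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the authors' implementation (see the linked repository for the real code); the planar-flow layer, the latent dimension, the batch size, and the number of flow steps are illustrative assumptions.

```python
# Minimal sketch: a stack of planar normalizing flows applied to a sample from a
# simple Gaussian approximate posterior, as one way to obtain a more flexible
# joint posterior in a multimodal variational model. Not the authors' code.
import torch
import torch.nn as nn

class PlanarFlow(nn.Module):
    """One planar flow step: f(z) = z + u * tanh(w^T z + b).
    The invertibility constraint on u is omitted for brevity."""
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(torch.randn(dim) * 0.01)
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        # z: (batch, dim)
        lin = z @ self.w + self.b                              # (batch,)
        f_z = z + self.u * torch.tanh(lin).unsqueeze(-1)       # transformed sample
        # log|det df/dz| = log|1 + u^T psi|, with psi = (1 - tanh^2(lin)) * w
        psi = (1 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w
        log_det = torch.log(torch.abs(1 + psi @ self.u) + 1e-8)
        return f_z, log_det

# Hypothetical usage: transform a reparameterized sample z0 ~ N(mu, sigma^2)
# (e.g. the fused output of the modality encoders) through K flow steps.
# The summed log-determinants enter the flow-corrected ELBO.
dim, K, batch = 20, 4, 8
flows = nn.ModuleList([PlanarFlow(dim) for _ in range(K)])
mu, log_var = torch.zeros(batch, dim), torch.zeros(batch, dim)
z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)
sum_log_det = torch.zeros(batch)
for flow in flows:
    z, log_det = flow(z)
    sum_log_det = sum_log_det + log_det
```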
