Paper Title
A Sparsity-promoting Dictionary Model for Variational Autoencoders
Paper Authors
Paper Abstract
Structuring the latent space in probabilistic deep generative models, e.g., variational autoencoders (VAEs), is important to yield more expressive models and interpretable representations, and to avoid overfitting. One way to achieve this objective is to impose a sparsity constraint on the latent variables, e.g., via a Laplace prior. However, such approaches usually complicate the training phase, and they sacrifice reconstruction quality to promote sparsity. In this paper, we propose a simple yet effective methodology to structure the latent space via a sparsity-promoting dictionary model, which assumes that each latent code can be written as a sparse linear combination of the columns of a dictionary. In particular, we leverage a computationally efficient and tuning-free method that relies on a zero-mean Gaussian latent prior with learnable variances. We derive a variational inference scheme to train the model. Experiments on speech generative modeling demonstrate the advantage of the proposed approach over competing techniques: it promotes sparsity without degrading the output speech quality.
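To make the abstract's construction concrete, below is a minimal, illustrative PyTorch sketch of one way to realize a dictionary-structured latent space: each latent code z is a linear combination of dictionary columns, z = D w, and a zero-mean Gaussian prior with learnable per-atom variances is placed on the coefficients w. The class name, layer sizes, and the exact placement of the learnable-variance prior are assumptions for illustration, not the authors' implementation; the paper's actual variational inference scheme may differ in detail.

```python
import torch
import torch.nn as nn

class DictionaryLatentVAE(nn.Module):
    """Illustrative sketch: VAE whose latent code is a linear
    combination of a learnable dictionary's columns (hypothetical
    architecture, not the paper's exact model)."""

    def __init__(self, x_dim=257, z_dim=16, n_atoms=64, h_dim=128):
        super().__init__()
        # Encoder q(w | x): outputs mean and log-variance of the
        # dictionary coefficients w.
        self.encoder = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh())
        self.enc_mean = nn.Linear(h_dim, n_atoms)
        self.enc_logvar = nn.Linear(h_dim, n_atoms)
        # Learnable dictionary D (z_dim x n_atoms): each latent code is
        # z = D w, a linear combination of the dictionary's columns.
        self.dictionary = nn.Parameter(0.1 * torch.randn(z_dim, n_atoms))
        # Zero-mean Gaussian prior on w with learnable log-variances;
        # driving a variance toward zero effectively prunes the
        # corresponding atom, which is what promotes sparsity
        # (assumed placement of the prior, for illustration).
        self.prior_logvar = nn.Parameter(torch.zeros(n_atoms))
        # Decoder p(x | z).
        self.decoder = nn.Sequential(nn.Linear(z_dim, h_dim), nn.Tanh(),
                                     nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.encoder(x)
        mean, logvar = self.enc_mean(h), self.enc_logvar(h)
        # Reparameterization trick: sample the coefficients w.
        w = mean + torch.exp(0.5 * logvar) * torch.randn_like(mean)
        z = w @ self.dictionary.T  # z = D w
        x_hat = self.decoder(z)
        # Closed-form KL( N(mean, var) || N(0, prior_var) ), per atom.
        kl = 0.5 * torch.sum(
            self.prior_logvar - logvar
            + (logvar.exp() + mean.pow(2)) / self.prior_logvar.exp()
            - 1.0,
            dim=-1)
        return x_hat, kl
```

Training would then combine a reconstruction term with the KL term in the usual evidence-lower-bound fashion, e.g. `loss = nn.functional.mse_loss(x_hat, x) + kl.mean()`; because the prior variances are themselves learned, no extra sparsity weight needs hand-tuning, which is one reading of the "tuning-free" claim in the abstract.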