FONDUE: an algorithm to find the optimal dimensionality of the latent representations of variational autoencoders

Paper title

FONDUE: an algorithm to find the optimal dimensionality of the latent representations of variational autoencoders

Authors

Lisa Bonheme, Marek Grzes

Abstract

When training a variational autoencoder (VAE) on a given dataset, determining the optimal number of latent variables is mostly done by grid search: a costly process in terms of computational time and carbon footprint. In this paper, we explore the intrinsic dimension estimation (IDE) of the data and latent representations learned by VAEs. We show that the discrepancies between the IDE of the mean and sampled representations of a VAE after only a few steps of training reveal the presence of passive variables in the latent space, which, in well-behaved VAEs, indicates a superfluous number of dimensions. Using this property, we propose FONDUE: an algorithm which quickly finds the number of latent dimensions after which the mean and sampled representations start to diverge (i.e., when passive variables are introduced), providing a principled method for selecting the number of latent dimensions for VAEs and autoencoders.
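
The search described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not specify the search strategy, so the doubling-then-bisection scheme, the `threshold` parameter, and the `ide_gap` callable are all assumptions made for illustration. Here `ide_gap(n)` is a hypothetical function that would train a VAE with `n` latent dimensions for a few steps and return the difference between the IDE of its mean and sampled representations; a toy stand-in is used so the example runs on its own.

```python
from typing import Callable


def find_latent_dim(ide_gap: Callable[[int], float],
                    threshold: float,
                    start: int = 1,
                    max_dim: int = 1024) -> int:
    """Return the largest n in [1, max_dim] with ide_gap(n) <= threshold.

    Assumes (for this sketch) that ide_gap is non-decreasing in n: the gap
    stays small until passive variables appear, then grows.
    Returns 0 if no n satisfies the threshold.
    """
    # Phase 1: double n until the gap exceeds the threshold or n leaves the range.
    lo, hi = 0, start
    while hi <= max_dim and ide_gap(hi) <= threshold:
        lo, hi = hi, hi * 2
    hi = min(hi, max_dim + 1)

    # Phase 2: bisect between lo (known acceptable, or 0) and hi (known too large).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if ide_gap(mid) <= threshold:
            lo = mid
        else:
            hi = mid
    return lo


def toy_gap(n: int) -> float:
    """Hypothetical stand-in: no IDE gap up to 10 dimensions, growing afterwards."""
    return max(0.0, float(n - 10))


selected = find_latent_dim(toy_gap, threshold=0.5)
print(selected)
```

In a real setting, each call to `ide_gap` would involve a short training run and two intrinsic dimension estimates, so the logarithmic number of evaluations is what would make this kind of search cheaper than a full grid search.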
