Disentangled Image Generation Through Structured Noise Injection

Paper Title

Disentangled Image Generation Through Structured Noise Injection

Authors

Yazeed Alharbi, Peter Wonka

Abstract

We explore different design choices for injecting noise into generative adversarial networks (GANs) with the goal of disentangling the latent space. Instead of traditional approaches, we propose feeding multiple noise codes through separate fully-connected layers respectively. The aim is restricting the influence of each noise code to specific parts of the generated image. We show that disentanglement in the first layer of the generator network leads to disentanglement in the generated image. Through a grid-based structure, we achieve several aspects of disentanglement without complicating the network architecture and without requiring labels. We achieve spatial disentanglement, scale-space disentanglement, and disentanglement of the foreground object from the background style allowing fine-grained control over the generated images. Examples include changing facial expressions in face images, changing beak length in bird images, and changing car dimensions in car images. This empirically leads to better disentanglement scores than state-of-the-art methods on the FFHQ dataset.
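
The grid-based idea in the abstract can be illustrated with a minimal sketch. It assumes a 4x4 grid, one noise code per grid cell, and one small fully-connected layer per cell that fills that cell of the generator's first-layer input tensor. The names (`make_fc`, `apply_fc`, `build_input_tensor`), the sizes, and the random weights are all hypothetical and are not taken from the paper; the convolutional layers of the generator are omitted entirely. The sketch only shows that resampling one cell's code changes that cell of the input tensor and nothing else.

```python
import random


def make_fc(in_dim, out_dim, rng):
    """Create a randomly initialised fully-connected layer as (weights, bias)."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def apply_fc(layer, z):
    """Apply a fully-connected layer to a noise code z."""
    weights, bias = layer
    return [sum(w * x for w, x in zip(row, z)) + b for row, b in zip(weights, bias)]


def build_input_tensor(codes, layers, grid):
    """Map each cell's noise code through that cell's own layer.

    Returns tensor[row][col], a list of channel values for that grid cell.
    """
    return [[apply_fc(layers[r][c], codes[r][c]) for c in range(grid)]
            for r in range(grid)]


# Assumed sizes, chosen only for illustration.
GRID, CODE_DIM, CHANNELS = 4, 8, 16
rng = random.Random(0)

layers = [[make_fc(CODE_DIM, CHANNELS, rng) for _ in range(GRID)]
          for _ in range(GRID)]
codes = [[[rng.gauss(0.0, 1.0) for _ in range(CODE_DIM)] for _ in range(GRID)]
         for _ in range(GRID)]
base = build_input_tensor(codes, layers, GRID)

# Resample the noise code of a single cell and rebuild the tensor.
edited_codes = [[list(z) for z in row] for row in codes]
edited_codes[1][2] = [rng.gauss(0.0, 1.0) for _ in range(CODE_DIM)]
edited = build_input_tensor(edited_codes, layers, GRID)

# Collect the grid cells whose channel values differ after the edit.
changed = [(r, c) for r in range(GRID) for c in range(GRID)
           if base[r][c] != edited[r][c]]
print(changed)
```

Whether this locality carries through to the generated image is the paper's empirical claim; the sketch covers only the first-layer input.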
