Paper Title


Guided Generative Adversarial Neural Network for Representation Learning and High Fidelity Audio Generation using Fewer Labelled Audio Data

Authors

Kazi Nazmul Haque, Rajib Rana, John H. L. Hansen, Björn Schuller

Abstract


Recent improvements in Generative Adversarial Neural Networks (GANs) have shown their ability to generate higher quality samples as well as to learn good representations for transfer learning. Most representation learning methods based on GANs learn representations ignoring their post-use scenario, which can lead to increased generalisation ability. However, the model can become redundant if it is intended for a specific task. For example, assume we have a vast unlabelled audio dataset, and we want to learn a representation from this dataset so that it can be used to improve the emotion recognition performance on a small labelled audio dataset. During representation learning training, if the model does not know about the subsequent emotion recognition task, it can completely ignore emotion-related characteristics in the learnt representation. This is a fundamental challenge for any unsupervised representation learning model. In this paper, we aim to address this challenge by proposing a novel GAN framework: the Guided Generative Adversarial Neural Network (GGAN), which guides a GAN to focus on learning desired representations and generating superior quality samples for audio data, leveraging fewer labelled samples. Experimental results show that using a very small amount of labelled data as guidance, a GGAN learns significantly better representations.
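The abstract's core idea, that a small labelled set "guides" the GAN toward task-relevant (e.g. emotion-related) features, can be illustrated with a toy loss computation. The sketch below is not the authors' architecture: the network shapes, the single-matrix "networks", and the specific guidance terms are all illustrative assumptions. It combines standard non-saturating GAN losses with two hypothetical guidance losses: a classifier cross-entropy on the labelled real batch, and a cross-entropy pushing class-conditioned fake samples toward their requested class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative, not from the paper).
LATENT, FEAT, N_CLASSES = 8, 16, 4

# Toy linear "networks": each is just a weight matrix.
G = rng.normal(size=(LATENT + N_CLASSES, FEAT)) * 0.1  # generator: (z, class) -> sample
D = rng.normal(size=(FEAT, 1)) * 0.1                   # discriminator: sample -> real/fake score
C = rng.normal(size=(FEAT, N_CLASSES)) * 0.1           # guide classifier: sample -> class logits

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_entropy(logits, labels):
    # Softmax cross-entropy over integer class labels.
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-8))

def guided_gan_losses(real_x, real_y):
    """One step of loss computation for a guided GAN sketch:
    the small labelled batch (real_x, real_y) supervises the
    classifier, which in turn guides the generator toward
    class-conditional (e.g. emotion-relevant) samples."""
    n = real_x.shape[0]
    z = rng.normal(size=(n, LATENT))
    y_fake = rng.integers(0, N_CLASSES, size=n)          # requested classes for fakes
    fake_x = np.concatenate([z, np.eye(N_CLASSES)[y_fake]], axis=1) @ G

    # Standard GAN losses (non-saturating form).
    d_real = sigmoid(real_x @ D)
    d_fake = sigmoid(fake_x @ D)
    d_loss = -np.mean(np.log(d_real + 1e-8) + np.log(1 - d_fake + 1e-8))
    g_loss = -np.mean(np.log(d_fake + 1e-8))

    # Guidance losses: supervise the classifier on the few labelled
    # samples, and push fakes toward their requested class.
    c_loss = cross_entropy(real_x @ C, real_y)
    guide_loss = cross_entropy(fake_x @ C, y_fake)
    return d_loss, g_loss, c_loss, guide_loss

real_x = rng.normal(size=(5, FEAT))
real_y = rng.integers(0, N_CLASSES, size=5)
d_loss, g_loss, c_loss, guide_loss = guided_gan_losses(real_x, real_y)
print(d_loss, g_loss, c_loss, guide_loss)
```

In a full training loop, `d_loss` would update the discriminator, while `g_loss + guide_loss` would update the generator and `c_loss` the classifier, so the guidance signal from the small labelled set shapes both the generated samples and the learnt representation.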
