Paper Title
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Paper Authors
Paper Abstract
When trained effectively, the Variational Autoencoder (VAE) can be both a powerful generative model and an effective representation learning framework for natural language. In this paper, we propose the first large-scale language VAE model, Optimus. A universal latent embedding space for sentences is first pre-trained on a large text corpus, and then fine-tuned for various language generation and understanding tasks. Compared with GPT-2, Optimus enables guided language generation at an abstract level using latent vectors. Compared with BERT, Optimus generalizes better on low-resource language understanding tasks thanks to its smooth latent space structure. Extensive experimental results on a wide range of language tasks demonstrate the effectiveness of Optimus. It achieves a new state-of-the-art on VAE language modeling benchmarks. We hope that our first pre-trained big VAE language model and its results can help the NLP community renew interest in deep generative models in the era of large-scale pre-training, and make these principled methods more practical.
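To make the idea in the abstract concrete, below is a minimal sketch (not the actual Optimus code, which pairs a BERT encoder with a GPT-2 decoder) of a sentence VAE: an encoder maps a sentence to a latent code z, a decoder reconstructs tokens conditioned on z, and interpolating between two latent codes illustrates the kind of guided generation a smooth latent space enables. All names here (ToySentenceVAE, the toy GRU encoder/decoder, the placeholder token ids) are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn

class ToySentenceVAE(nn.Module):
    """Toy sentence VAE: encoder -> latent z -> decoder (illustrative only)."""
    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, latent_dim)
        self.to_logvar = nn.Linear(hid_dim, latent_dim)
        self.latent_to_hidden = nn.Linear(latent_dim, hid_dim)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def encode(self, tokens):
        _, h = self.encoder(self.embed(tokens))   # h: (1, batch, hid_dim)
        h = h.squeeze(0)
        return self.to_mu(h), self.to_logvar(h)   # parameters of q(z|x)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)   # z ~ N(mu, sigma^2)

    def decode(self, z, tokens):
        h0 = self.latent_to_hidden(z).unsqueeze(0)   # condition decoder on z
        out, _ = self.decoder(self.embed(tokens), h0)
        return self.out(out)                          # logits over the vocabulary

    def forward(self, tokens):
        mu, logvar = self.encode(tokens)
        z = self.reparameterize(mu, logvar)
        logits = self.decode(z, tokens)
        # KL term of the ELBO against a standard normal prior
        kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1))
        return logits, kl

# Guided-generation sketch: interpolate between the latent codes of two sentences.
model = ToySentenceVAE()
s1 = torch.randint(0, 1000, (1, 8))   # placeholder token ids for sentence A
s2 = torch.randint(0, 1000, (1, 8))   # placeholder token ids for sentence B
z1, _ = model.encode(s1)
z2, _ = model.encode(s2)
for alpha in (0.0, 0.5, 1.0):
    z = (1 - alpha) * z1 + alpha * z2   # move smoothly through the latent space
    logits = model.decode(z, s1)        # decode conditioned on the interpolated code
```

In this sketch the interpolation loop stands in for the "guided language generation from an abstract level" described above: because the latent space is trained to be smooth, codes between two sentences decode to sentences whose content shifts gradually from one to the other.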