Paper Title
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Paper Authors
Paper Abstract
When trained effectively, the Variational Autoencoder (VAE) can be both a powerful generative model and an effective representation learning framework for natural language. In this paper, we propose the first large-scale language VAE model, Optimus. A universal latent embedding space for sentences is first pre-trained on a large text corpus, and then fine-tuned for various language generation and understanding tasks. Compared with GPT-2, Optimus enables guided language generation at an abstract level using latent vectors. Compared with BERT, Optimus generalizes better on low-resource language understanding tasks thanks to its smooth latent space structure. Extensive experimental results on a wide range of language tasks demonstrate the effectiveness of Optimus. It achieves a new state-of-the-art on VAE language modeling benchmarks. We hope that our first pre-trained big VAE language model and its results can help the NLP community renew interest in deep generative models in the era of large-scale pre-training, and make these principled methods more practical.
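To make the idea in the abstract concrete, below is a minimal sketch (not the actual Optimus code, which pairs a BERT encoder with a GPT-2 decoder) of a sentence VAE: an encoder maps a sentence to a latent code z, a decoder reconstructs tokens conditioned on z, and interpolating between two latent codes illustrates the kind of guided generation a smooth latent space enables. All names here (ToySentenceVAE, the toy GRU encoder/decoder, the placeholder token ids) are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn

class ToySentenceVAE(nn.Module):
    """Toy sentence VAE: encoder -> latent z -> decoder (illustrative only)."""
    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, latent_dim)
        self.to_logvar = nn.Linear(hid_dim, latent_dim)
        self.latent_to_hidden = nn.Linear(latent_dim, hid_dim)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def encode(self, tokens):
        _, h = self.encoder(self.embed(tokens))   # h: (1, batch, hid_dim)
        h = h.squeeze(0)
        return self.to_mu(h), self.to_logvar(h)   # parameters of q(z|x)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)   # z ~ N(mu, sigma^2)

    def decode(self, z, tokens):
        h0 = self.latent_to_hidden(z).unsqueeze(0)   # condition decoder on z
        out, _ = self.decoder(self.embed(tokens), h0)
        return self.out(out)                          # logits over the vocabulary

    def forward(self, tokens):
        mu, logvar = self.encode(tokens)
        z = self.reparameterize(mu, logvar)
        logits = self.decode(z, tokens)
        # KL term of the ELBO against a standard normal prior
        kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1))
        return logits, kl

# Guided-generation sketch: interpolate between the latent codes of two sentences.
model = ToySentenceVAE()
s1 = torch.randint(0, 1000, (1, 8))   # placeholder token ids for sentence A
s2 = torch.randint(0, 1000, (1, 8))   # placeholder token ids for sentence B
z1, _ = model.encode(s1)
z2, _ = model.encode(s2)
for alpha in (0.0, 0.5, 1.0):
    z = (1 - alpha) * z1 + alpha * z2   # move smoothly through the latent space
    logits = model.decode(z, s1)        # decode conditioned on the interpolated code
```

In this sketch the interpolation loop stands in for the "guided language generation from an abstract level" described above: because the latent space is trained to be smooth, codes between two sentences decode to sentences whose content shifts gradually from one to the other.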