Paper Title
Latent Template Induction with Gumbel-CRFs
Paper Authors
Paper Abstract
Learning to control the structure of sentences is a challenging problem in text generation. Existing work relies either on simple deterministic approaches or on RL-based hard structures. We explore the use of structured variational autoencoders to infer latent templates for sentence generation, using a soft, continuous relaxation so that training can use reparameterization. Specifically, we propose the Gumbel-CRF, a continuous relaxation of the CRF sampling algorithm based on a relaxed Forward-Filtering Backward-Sampling (FFBS) approach. As a reparameterized gradient estimator, the Gumbel-CRF gives more stable gradients than score-function-based estimators. As a structured inference network, it learns interpretable templates during training, which allows us to control the decoder at test time. We demonstrate the effectiveness of our methods with experiments on data-to-text generation and unsupervised paraphrase generation.
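To make the relaxed FFBS idea in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a linear-chain CRF given by hypothetical log-potential tables `emit[t][k]` (emission scores) and `trans[i][j]` (transition scores), runs the forward (filtering) pass in log space, and then samples backward with Gumbel noise. Each step yields a hard state via argmax and a relaxed state via a temperature-`tau` softmax over the same perturbed logits. Conditioning the next backward step on the hard state is one possible design choice assumed here. The function name `relaxed_ffbs` and all parameter names are illustrative only.

```python
import math
import random


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) over a list of floats."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]


def gumbel(rng):
    """Draw one standard Gumbel sample as -log(-log(U)), with U clamped away from 0 and 1."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def relaxed_ffbs(emit, trans, tau=1.0, rng=None):
    """Sketch of Gumbel-relaxed FFBS on a linear-chain CRF (hypothetical helper).

    emit:  T x K list of emission log-potentials (assumed input format).
    trans: K x K list of transition log-potentials, trans[i][j] for i -> j.
    Returns (hard, soft): hard state indices and relaxed one-hot vectors.
    """
    rng = rng or random.Random()
    T, K = len(emit), len(trans)

    # Forward filtering: alpha[t][j] = log-sum of scores of all prefixes ending in state j.
    alpha = [list(emit[0])]
    for t in range(1, T):
        alpha.append([
            emit[t][j] + logsumexp([alpha[t - 1][i] + trans[i][j] for i in range(K)])
            for j in range(K)
        ])

    # Backward sampling with Gumbel perturbation.
    hard = [0] * T
    soft = [None] * T
    for t in range(T - 1, -1, -1):
        if t == T - 1:
            logits = alpha[t]
        else:
            # Condition on the hard sample already drawn for step t + 1 (assumed choice).
            logits = [alpha[t][i] + trans[i][hard[t + 1]] for i in range(K)]
        noisy = [l + gumbel(rng) for l in logits]
        hard[t] = max(range(K), key=lambda k: noisy[k])  # Gumbel-max: exact sample
        soft[t] = softmax([v / tau for v in noisy])      # Gumbel-softmax: relaxed sample
    return hard, soft


if __name__ == "__main__":
    emit = [[0.5, -1.0, 0.2], [0.1, 0.3, -0.4], [1.0, 0.0, -0.5]]
    trans = [[0.2, -0.3, 0.1], [0.0, 0.5, -0.2], [-0.1, 0.4, 0.3]]
    states, relaxed = relaxed_ffbs(emit, trans, tau=0.5, rng=random.Random(0))
    print(states)
    print(relaxed)
```

This sketch covers only the sampling pass. In a training setup, the same computation would be written in an automatic-differentiation framework so that gradients flow through the relaxed vectors in `soft`; lowering `tau` pushes those vectors closer to one-hot.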