Towards Unsupervised Content Disentanglement in Sentence Representations via Syntactic Roles

Paper title

Towards Unsupervised Content Disentanglement in Sentence Representations via Syntactic Roles

Authors

Ghazi Felhi, Joseph Le Roux, Djamé Seddah

Abstract

Linking neural representations to linguistic factors is crucial in order to build and analyze NLP models interpretable by humans. Among these factors, syntactic roles (e.g. subjects, direct objects, $\dots$) and their realizations are essential markers since they can be understood as a decomposition of predicative structures and thus the meaning of sentences. Starting from a deep probabilistic generative model with attention, we measure the interaction between latent variables and realizations of syntactic roles and show that it is possible to obtain, without supervision, representations of sentences where different syntactic roles correspond to clearly identified different latent variables. The probabilistic model we propose is an Attention-Driven Variational Autoencoder (ADVAE). Drawing inspiration from Transformer-based machine translation models, ADVAEs enable the analysis of the interactions between latent variables and input tokens through attention. We also develop an evaluation protocol to measure disentanglement with regard to the realizations of syntactic roles. This protocol is based on attention maxima for the encoder and on latent variable perturbations for the decoder. Our experiments on raw English text from the SNLI dataset show that $\textit{i)}$ disentanglement of syntactic roles can be induced without supervision, $\textit{ii)}$ ADVAE separates syntactic roles better than classical sequence VAEs and Transformer VAEs, $\textit{iii)}$ realizations of syntactic roles can be separately modified in sentences by mere intervention on the associated latent variables. Our work constitutes a first step towards unsupervised controllable content generation. The code for our work is publicly available.
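
The abstract says only that the encoder-side evaluation is "based on attention maxima"; it does not spell out the procedure. The sketch below is one possible reading, not the authors' implementation: for each latent variable, take the input token that receives the highest attention weight and record the syntactic role of that token. The function names, the role labels (`subj`, `verb`, `dobj`), and the attention values are all hypothetical and made up for illustration.

```python
from collections import Counter


def role_of_attention_max(attention, token_roles):
    """For each latent variable, return the role of its most-attended token.

    attention:   one row per latent variable; each row holds that latent's
                 attention weights over the input tokens (assumed layout).
    token_roles: one syntactic-role label per token, or None when the token
                 belongs to no role of interest.
    """
    roles = []
    for row in attention:
        # Index of the token with the largest attention weight in this row.
        best_token = max(range(len(row)), key=row.__getitem__)
        roles.append(token_roles[best_token])
    return roles


def tally_roles(per_sentence_roles, n_latents):
    """Count, for each latent variable, how often it peaks on each role."""
    counts = [Counter() for _ in range(n_latents)]
    for roles in per_sentence_roles:
        for latent, role in enumerate(roles):
            if role is not None:
                counts[latent][role] += 1
    return counts


# Toy sentence with invented attention weights: "a dog chases a ball"
token_roles = ["subj", "subj", "verb", "dobj", "dobj"]
attention = [
    [0.1, 0.6, 0.1, 0.1, 0.1],  # latent 0 peaks on "dog"
    [0.1, 0.1, 0.1, 0.2, 0.5],  # latent 1 peaks on "ball"
    [0.1, 0.2, 0.5, 0.1, 0.1],  # latent 2 peaks on "chases"
]
roles = role_of_attention_max(attention, token_roles)
counts = tally_roles([roles], n_latents=3)
print(roles)  # ['subj', 'dobj', 'verb']
```

Tallied over many sentences, counts of this kind would show whether a given latent variable peaks on one role consistently, which is the sort of correspondence between latent variables and syntactic roles the abstract describes.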
