Paper Title
Sentence Analogies: Exploring Linguistic Relationships and Regularities in Sentence Embeddings
Paper Authors
Paper Abstract
While important properties of word vector representations have been studied extensively, far less is known about the properties of sentence vector representations. Word vectors are often evaluated by assessing to what degree they exhibit regularities with regard to relationships of the sort considered in word analogies. In this paper, we investigate to what extent commonly used sentence vector representation spaces also reflect certain kinds of regularities. We propose a number of schemes to induce evaluation data, based on lexical analogy data as well as semantic relationships between sentences. Our experiments consider a wide range of sentence embedding methods, including ones based on BERT-style contextual embeddings. We find that different models differ substantially in their ability to reflect such regularities.
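To make the notion of an analogy "regularity" concrete, the following is a minimal sketch of the standard vector-offset test familiar from word analogies, applied to sentences. It is an illustration only: the abstract does not specify the paper's evaluation procedure, so the function names (`cosine`, `solve_analogy`), the toy two-dimensional vectors, and the example sentences are all assumptions made here and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, candidates, embeddings):
    """Answer 'a is to b as c is to ?' with the vector-offset method.

    The target point is emb(b) - emb(a) + emb(c); the candidate whose
    embedding has the highest cosine similarity to it is returned.
    """
    target = [
        y - x + z
        for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    return max(candidates, key=lambda s: cosine(embeddings[s], target))


# Hypothetical toy embeddings (assumption): the second dimension
# encodes negation, so the affirmative-to-negated offset is shared.
toy = {
    "He is happy.": [1.0, 1.0],
    "He is not happy.": [1.0, -1.0],
    "She is tall.": [2.0, 1.0],
    "She is not tall.": [2.0, -1.0],
    "She is short.": [-2.0, 1.0],
}

predicted = solve_analogy(
    "He is happy.",
    "He is not happy.",
    "She is tall.",
    ["She is not tall.", "She is short."],
    toy,
)
print(predicted)
```

In practice the toy dictionary would be replaced by vectors from an actual sentence encoder, and accuracy over many such quadruples would indicate how strongly that embedding space reflects the relationship.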