Title

Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation

Authors

Nils Reimers, Iryna Gurevych

Abstract

We present an easy and efficient method to extend existing sentence embedding models to new languages. This makes it possible to create multilingual versions of previously monolingual models. Training is based on the idea that a translated sentence should be mapped to the same location in the vector space as the original sentence. We use the original (monolingual) model to generate sentence embeddings for the source language and then train a new system on translated sentences to mimic the original model. Compared to other methods for training multilingual sentence embeddings, this approach has several advantages: it is easy to extend existing models to new languages with relatively few samples, it is easier to ensure desired properties of the vector space, and the hardware requirements for training are lower. We demonstrate the effectiveness of our approach for 50+ languages from various language families. Code to extend sentence embedding models to more than 400 languages is publicly available.
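The training objective described above can be sketched in a few lines. The following is a minimal toy illustration, not the paper's actual implementation: the real teacher and student are full sentence-embedding models (the paper's code uses the sentence-transformers library), whereas here both are stand-in linear projections over a bag-of-bytes feature vector, so that the distillation loss — MSE between the frozen teacher's source-sentence embeddings and the student's embeddings of both the source and the translated sentence — can be computed and minimized without any external models or data.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # embedding dimensionality of the toy models

def featurize(sentence):
    """Bag-of-bytes feature vector; a toy stand-in for a real encoder's input."""
    feats = np.zeros(256)
    for b in sentence.encode("utf-8"):
        feats[b] += 1.0
    return feats

# Linear "encoders": embedding = features @ W. The teacher stays frozen and
# only the student is updated, mirroring the setup in which a monolingual
# teacher supervises a new multilingual student.
teacher_W = rng.normal(size=(256, DIM))
student_W = rng.normal(size=(256, DIM)) * 0.1

# Parallel training data: (source sentence, its translation).
parallel = [("Hello World", "Hallo Welt"),
            ("How are you?", "Wie geht es dir?")]

def distill_loss(W):
    """MSE between the frozen teacher's source embedding and the student's
    embeddings of both the source sentence and its translation."""
    total, n = 0.0, 2 * len(parallel)
    for src, tgt in parallel:
        target = featurize(src) @ teacher_W  # fixed teacher embedding
        for sent in (src, tgt):
            total += np.mean((featurize(sent) @ W - target) ** 2)
    return total / n

def distill_grad(W):
    """Gradient of distill_loss with respect to the student weights."""
    g, n = np.zeros_like(W), 2 * len(parallel)
    for src, tgt in parallel:
        target = featurize(src) @ teacher_W
        for sent in (src, tgt):
            f = featurize(sent)
            g += np.outer(f, f @ W - target) * (2.0 / (DIM * n))
    return g

before = distill_loss(student_W)
for _ in range(200):  # plain gradient descent on the student only
    student_W -= 0.05 * distill_grad(student_W)
after = distill_loss(student_W)
print(f"loss before: {before:.4f}, after: {after:.4f}")
```

After training, the student places each translation near the teacher's embedding of the corresponding source sentence, which is exactly the property the method uses to transfer a monolingual vector space to new languages.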
