Exploring the Value of Personalized Word Embeddings

Paper Title

Exploring the Value of Personalized Word Embeddings

Authors

Charles Welch, Jonathan K. Kummerfeld, Verónica Pérez-Rosas, Rada Mihalcea

Abstract

In this paper, we introduce personalized word embeddings, and examine their value for language modeling. We compare the performance of our proposed prediction model when using personalized versus generic word representations, and study how these representations can be leveraged for improved performance. We provide insight into what types of words can be more accurately predicted when building personalized models. Our results show that a subset of words belonging to specific psycholinguistic categories tend to vary more in their representations across users and that combining generic and personalized word embeddings yields the best performance, with a 4.7% relative reduction in perplexity. Additionally, we show that a language model using personalized word embeddings can be effectively used for authorship attribution.
