Paper title
Exploring the Advantages of Dense-Vector to One-Hot Encoding of Intent Classes in Out-of-Scope Detection Tasks
Paper authors
Paper abstract
This work explores the intrinsic limitations of the popular one-hot encoding method for intent classification when out-of-scope (OOS) inputs must be detected. Although recent work has shown that OOS detection can improve significantly when the intent classes are represented as dense vectors based on domain-specific knowledge, we argue in this paper that such gains are more likely due to the advantages of dense-vector over one-hot encodings in representing the complexity of the OOS space. We start by showing how dense-vector encodings can create OOS spaces with much richer topologies than one-hot encodings can. We then demonstrate empirically, using four standard intent classification datasets, that knowledge-free, randomly generated dense-vector encodings of intent classes can yield gains of over 20% over one-hot encodings and also outperform the previous, domain-knowledge-based state of the art (SOTA) on one of the datasets. We finish by describing a novel algorithm to search for good dense-vector encodings and present initial but promising experimental results of its use.
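To make the contrast between the two encodings concrete, the following is a minimal sketch, not the paper's implementation. It builds one-hot class codes and knowledge-free random dense class codes, then decodes a model output by nearest class code. The function names, the unit-norm Gaussian sampling, and the distance-threshold rule for flagging OOS inputs are all assumptions made for illustration; the abstract does not specify how codes are sampled or how OOS inputs are detected.

```python
import math
import random


def one_hot_codes(num_classes):
    """One-hot target vectors: class i is the i-th standard basis vector."""
    return [[1.0 if i == j else 0.0 for j in range(num_classes)]
            for i in range(num_classes)]


def random_dense_codes(num_classes, dim, seed=0):
    """Knowledge-free dense targets: one random unit vector per class.

    Assumption: Gaussian samples scaled to unit length. The abstract only
    says the encodings are randomly generated.
    """
    rng = random.Random(seed)
    codes = []
    for _ in range(num_classes):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        codes.append([x / norm for x in v])
    return codes


def decode(output, codes, max_distance):
    """Return the index of the nearest class code, or None for OOS.

    Assumption: an input counts as out-of-scope when the model output is
    farther than max_distance (Euclidean) from every class code.
    """
    best_idx, best_dist = None, float("inf")
    for idx, code in enumerate(codes):
        dist = math.dist(output, code)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx if best_dist <= max_distance else None


if __name__ == "__main__":
    codes = random_dense_codes(num_classes=5, dim=16, seed=0)
    print(decode(codes[2], codes, max_distance=0.5))      # an in-scope output
    print(decode([10.0] * 16, codes, max_distance=0.5))   # a far-away output
```

In this sketch the dense code dimension `dim` is independent of the number of classes, whereas a one-hot code always has exactly one dimension per class. That extra freedom is one way to picture the richer OOS space the abstract refers to.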