Paper Title


InfoCSE: Information-aggregated Contrastive Learning of Sentence Embeddings

Authors

Xing Wu, Chaochen Gao, Zijia Lin, Jizhong Han, Zhongyuan Wang, Songlin Hu

Abstract


Contrastive learning has been extensively studied for sentence embedding learning, under the assumption that the embeddings of different views of the same sentence should be closer. The constraint imposed by this assumption is weak; a good sentence representation should also be able to reconstruct the original sentence fragments. Therefore, this paper proposes an information-aggregated contrastive learning framework for learning unsupervised sentence embeddings, termed InfoCSE. InfoCSE forces the representation at the [CLS] position to aggregate denser sentence information by introducing an additional masked language model task and a well-designed auxiliary network. We evaluate the proposed InfoCSE on several benchmark datasets w.r.t. the semantic textual similarity (STS) task. Experimental results show that InfoCSE outperforms SimCSE by an average Spearman correlation of 2.60% on BERT-base and 1.77% on BERT-large, achieving state-of-the-art results among unsupervised sentence representation learning methods. Our code is available at https://github.com/caskcsg/sentemb/tree/main/InfoCSE.
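The abstract describes a joint objective: a contrastive term that pulls embeddings of two views of the same sentence together, plus an auxiliary masked-language-model term that pressures the [CLS] representation to carry enough information to reconstruct sentence fragments. Below is a minimal NumPy sketch of such a combined loss. The InfoNCE formulation follows the standard SimCSE-style in-batch contrastive objective; the `mlm_ce` input and the weighting hyperparameter `lam` are illustrative assumptions, not values or interfaces taken from the paper.

```python
import numpy as np

def info_nce(z1, z2, tau=0.05):
    """In-batch contrastive (InfoNCE) loss between two views.

    z1, z2: (batch, dim) embeddings; row i of z1 and row i of z2
    are two views of the same sentence (the positive pair), and all
    other rows in the batch serve as negatives.
    """
    # Cosine similarity matrix, scaled by temperature tau.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = (z1 @ z2.T) / tau
    # Cross-entropy with the matching pairs on the diagonal:
    # loss_i = -log( exp(sim_ii) / sum_j exp(sim_ij) )
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return float(np.mean(logsumexp - np.diag(sim)))

def combined_loss(z1, z2, mlm_ce, lam=0.1):
    """Contrastive loss plus a weighted auxiliary MLM term.

    mlm_ce: cross-entropy of the auxiliary masked-language-model head
    (computed elsewhere, e.g. by a small network conditioned on the
    [CLS] embedding). lam is an assumed weighting hyperparameter.
    """
    return info_nce(z1, z2) + lam * mlm_ce
```

With orthogonal embeddings, aligning each sentence with its own second view yields a much lower contrastive loss than aligning it with a shuffled batch, which is the behavior the contrastive term is meant to enforce.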
