Paper Title

Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering

Authors

Jingtao Ding, Yuhan Quan, Quanming Yao, Yong Li, Depeng Jin

Abstract

Negative sampling approaches are prevalent in implicit collaborative filtering for obtaining negative labels from massive unlabeled data. Efficiency and effectiveness, the two major concerns in negative sampling, are still not fully achieved by recent works, which use complicated structures and overlook the risk of false negative instances. In this paper, we first provide a novel understanding of negative instances by empirically observing that only a few instances are potentially important for model learning, and that false negatives tend to have stable predictions over many training iterations. These findings motivate us to simplify the model by sampling from a designed memory that stores only a few important candidates and, more importantly, to tackle the untouched false negative problem by favouring high-variance samples stored in the memory, which achieves efficient, high-quality sampling of true negatives. Empirical results on two synthetic datasets and three real-world datasets demonstrate both the robustness and the superiority of our negative sampling method.
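The variance-favouring idea in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual algorithm: the names `memory`, `score_history`, and `variance_favoring_sample` are hypothetical, and a real implementation would combine prediction variance with predicted scores when selecting hard negatives.

```python
import random


def variance_favoring_sample(memory, score_history, k=1):
    """Sample k candidate negatives from a small memory,
    weighting each candidate by the variance of its predicted
    scores across recent training iterations.

    Rationale (from the abstract): false negatives tend to have
    stable predictions over many iterations, so high-variance
    candidates are more likely to be true negatives.

    memory: list of candidate item ids (the small memory of
            potentially important negatives).
    score_history: dict mapping item id -> list of predicted
                   scores from recent iterations.
    """
    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Small epsilon keeps zero-variance candidates sampleable.
    weights = [variance(score_history[i]) + 1e-8 for i in memory]
    return random.choices(memory, weights=weights, k=k)


# Usage: item 2 has unstable predictions, so it dominates sampling.
memory = [1, 2, 3]
history = {1: [0.5, 0.5, 0.5],   # stable -> likely false negative
           2: [0.1, 0.9, 0.1],   # high variance -> likely true negative
           3: [0.4, 0.4, 0.4]}   # stable -> likely false negative
random.seed(0)
picks = variance_favoring_sample(memory, history, k=5)
```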
