Paper title
Deep k-NN for Noisy Labels
Paper authors
Paper abstract
Modern machine learning models are often trained on examples with noisy labels that hurt performance and are hard to identify. In this paper, we provide an empirical study showing that a simple $k$-nearest neighbor-based filtering approach on the logit layer of a preliminary model can remove mislabeled training data and produce more accurate models than many recently proposed methods. We also provide new statistical guarantees for its efficacy.
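
As an illustration only, the filtering idea in the abstract might be sketched as below. The function name `knn_filter`, the use of Euclidean distance, drawing neighbors from the training set itself (excluding each point), and plurality voting are all assumptions made for this sketch; the abstract does not specify them, and the logits here are hand-made stand-ins for the output of a preliminary model.

```python
import numpy as np

def knn_filter(logits, labels, k=3):
    """Hypothetical helper: return a boolean mask that is True where a
    point's label matches the plurality label of its k nearest
    neighbors in logit space (the point itself is excluded)."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    n = logits.shape[0]

    # Pairwise squared Euclidean distances between logit vectors.
    sq = np.sum(logits ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (logits @ logits.T)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor

    # Indices and labels of the k nearest neighbors of every point.
    nn_idx = np.argsort(d2, axis=1)[:, :k]
    nn_labels = labels[nn_idx]  # shape (n, k)

    # Count votes per class, then take the plurality class.
    num_classes = int(labels.max()) + 1
    votes = np.zeros((n, num_classes), dtype=int)
    for c in range(num_classes):
        votes[:, c] = (nn_labels == c).sum(axis=1)
    predicted = votes.argmax(axis=1)

    return predicted == labels

# Toy data: two well-separated clusters; index 3 sits in the first
# cluster but carries the second cluster's label (simulated noise).
toy_logits = np.array([
    [5.0, 0.0], [5.1, 0.1], [4.9, -0.1], [5.0, 0.2],
    [0.0, 5.0], [0.1, 5.1], [-0.1, 4.9], [0.2, 5.0],
])
toy_labels = np.array([0, 0, 0, 1, 1, 1, 1, 1])

keep = knn_filter(toy_logits, toy_labels, k=3)
clean_logits, clean_labels = toy_logits[keep], toy_labels[keep]
```

In this sketch, the points where `keep` is False would be dropped before retraining a final model on the remaining data. The value of `k` and the source of the neighbor set are tuning choices that this example does not attempt to settle.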