Paper Title
Calibrated Learning to Defer with One-vs-All Classifiers
Paper Authors
Paper Abstract
The learning to defer (L2D) framework has the potential to make AI systems safer. For a given input, the system can defer the decision to a human if the human is more likely than the model to take the correct action. We study the calibration of L2D systems, investigating whether the probabilities they output are sound. We find that Mozannar & Sontag's (2020) multiclass framework is not calibrated with respect to expert correctness. Moreover, it is not even guaranteed to produce valid probabilities, because its parameterization is degenerate for this purpose. We propose an L2D system based on one-vs-all classifiers that is able to produce calibrated probabilities of expert correctness. Furthermore, our loss function is also a consistent surrogate for multiclass L2D, as is that of Mozannar & Sontag (2020). Our experiments verify that not only is our system calibrated, but this benefit comes at no cost to accuracy. Our model's accuracy is always comparable (and often superior) to that of Mozannar & Sontag's (2020) model on tasks ranging from hate speech detection to galaxy classification to the diagnosis of skin lesions.
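The deferral rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a model that emits one real-valued logit per class plus one extra "defer" logit, and that each logit is passed through a sigmoid, as a one-vs-all classifier would do. The function names (`ova_defer_decision`, `softmax_expert_estimate`) are hypothetical. The ratio used in `softmax_expert_estimate` is likewise an assumption about how a softmax parameterization could be read as an expert-correctness estimate; the abstract itself does not state the formula.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps a real-valued logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def ova_defer_decision(class_logits, defer_logit):
    """One-vs-all style decision (illustrative assumption, see lead-in).

    Each class logit and the defer logit are squashed independently,
    so the expert-correctness estimate always lies in (0, 1).
    """
    class_probs = [sigmoid(g) for g in class_logits]
    p_expert = sigmoid(defer_logit)
    best = max(range(len(class_probs)), key=class_probs.__getitem__)
    # Defer when the expert is estimated to be more likely correct
    # than the model's most confident class.
    if p_expert > class_probs[best]:
        return {"action": "defer", "p_expert_correct": p_expert}
    return {"action": "predict", "label": best, "p_class": class_probs[best]}


def softmax_expert_estimate(class_logits, defer_logit):
    """Assumed softmax-based estimate p_defer / (1 - p_defer).

    The ratio is unbounded above, so it need not be a valid probability.
    Finite logits are assumed, which keeps 1 - p_defer above zero here.
    """
    logits = list(class_logits) + [defer_logit]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    p_defer = exps[-1] / sum(exps)
    return p_defer / (1.0 - p_defer)


if __name__ == "__main__":
    print(ova_defer_decision([2.0, -1.0, 0.5], defer_logit=3.0))
    print(ova_defer_decision([2.0, -1.0, 0.5], defer_logit=0.0))
    print(softmax_expert_estimate([0.0, 0.0], defer_logit=3.0))
```

Under these assumptions, the sigmoid-based estimate stays inside (0, 1) for any finite logit, whereas the ratio in the softmax sketch grows without bound as the defer logit increases. This is one way to picture why a parameterization might fail to yield valid probabilities of expert correctness.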