Paper title

Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited

Authors

Timo Dimitriadis, Tilmann Gneiting, Alexander I. Jordan

Abstract

A probability forecast or probabilistic classifier is reliable or calibrated if the predicted probabilities are matched by ex post observed frequencies, as examined visually in reliability diagrams. The classical binning and counting approach to plotting reliability diagrams has been hampered by a lack of stability under unavoidable, ad hoc implementation decisions. Here we introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way. CORP is based on non-parametric isotonic regression and implemented via the pool-adjacent-violators (PAV) algorithm; essentially, the CORP reliability diagram shows the graph of the PAV-(re)calibrated forecast probabilities. The CORP approach allows for uncertainty quantification via either resampling techniques or asymptotic theory, furnishes a new numerical measure of miscalibration, and provides a CORP-based Brier score decomposition that generalizes to any proper scoring rule. We anticipate that judicious uses of the PAV algorithm yield improved tools for diagnostics and inference for a very wide range of statistical and machine learning methods.
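To make the abstract's description concrete, the following is a minimal sketch of PAV recalibration and a Brier score decomposition built on it. It is not the authors' implementation: the function names (`pav_recalibrate`, `brier`, `corp_decomposition`) and the component labels MCB (miscalibration), DSC (discrimination), and UNC (uncertainty) are assumptions chosen for illustration, as is the choice to pool tied forecast values before running PAV. The decomposition is taken here as mean score = MCB - DSC + UNC, where the three terms compare the mean Brier scores of the original forecast, the PAV-recalibrated forecast, and the constant base-rate forecast.

```python
def pav_recalibrate(probs, outcomes):
    """Isotonic (non-decreasing) least-squares fit of outcomes on probs via PAV."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])

    # Stage 1 (assumed tie handling): pool identical forecast values so that
    # equal forecasts always receive the same recalibrated probability.
    groups = []  # each entry: [forecast value, sum of outcomes, count]
    for i in order:
        if groups and groups[-1][0] == probs[i]:
            groups[-1][1] += outcomes[i]
            groups[-1][2] += 1
        else:
            groups.append([probs[i], outcomes[i], 1])

    # Stage 2: pool adjacent violators. A block's fitted value is sum / count;
    # merge backwards while the previous block's mean exceeds the current one.
    blocks = []  # each entry: [sum of outcomes, count]
    for _, s, n in groups:
        blocks.append([s, n])
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s_last, n_last = blocks.pop()
            blocks[-1][0] += s_last
            blocks[-1][1] += n_last

    # Expand block means back to one value per case, in the original order.
    fitted_sorted = []
    for s, n in blocks:
        fitted_sorted.extend([s / n] * n)
    fitted = [0.0] * len(probs)
    for rank, i in enumerate(order):
        fitted[i] = fitted_sorted[rank]
    return fitted


def brier(probs, outcomes):
    """Mean Brier score of probability forecasts for binary outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def corp_decomposition(probs, outcomes):
    """Sketch of the decomposition: mean score = MCB - DSC + UNC."""
    recal = pav_recalibrate(probs, outcomes)
    base_rate = sum(outcomes) / len(outcomes)
    s_x = brier(probs, outcomes)                         # original forecast
    s_c = brier(recal, outcomes)                         # PAV-recalibrated forecast
    s_r = brier([base_rate] * len(outcomes), outcomes)   # constant reference forecast
    return {"score": s_x, "MCB": s_x - s_c, "DSC": s_r - s_c, "UNC": s_r}


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    outcomes = [0, 0, 1, 0, 1, 0, 1, 1]
    print(pav_recalibrate(probs, outcomes))
    print(corp_decomposition(probs, outcomes))
```

Plotting the recalibrated values from `pav_recalibrate` against the original forecast values gives the reliability curve the abstract describes. Swapping `brier` for another proper scoring rule (for example the logarithmic score, with care at fitted values of 0 and 1) would follow the same pattern.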
