Paper Title

Measuring Class-Imbalance Sensitivity of Deterministic Performance Evaluation Metrics

Authors

Azim Ahmadzadeh, Rafal A. Angryk

Abstract

The class-imbalance issue is intrinsic to many real-world machine learning tasks, particularly to rare-event classification problems. Although the impact and treatment of imbalanced data are widely known, the magnitude of a metric's sensitivity to class imbalance has attracted little attention. As a result, sensitive metrics are often dismissed even though their sensitivity may be only marginal. In this paper, we introduce an intuitive evaluation framework that quantifies metrics' sensitivity to class imbalance. Moreover, we reveal an interesting fact: metrics' sensitivity behaves logarithmically, meaning that higher imbalance ratios are associated with lower sensitivity. Our framework builds an intuitive understanding of the impact of class imbalance on metrics. We believe this can help avoid many common mistakes, especially the under-emphasized and incorrect assumption that all metrics' quantities are comparable under different class-imbalance ratios.
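
To make the closing point of the abstract concrete, the following is a minimal sketch of why a metric's value may not be comparable across imbalance ratios. It is not the authors' framework. It assumes a hypothetical classifier whose true-positive rate and false-positive rate stay fixed (0.8 and 0.1 here, chosen arbitrarily) while the number of negatives per positive grows. All function names are illustrative. Precision and F1 are used as examples of metrics that depend on the class ratio, and the true skill statistic (TPR minus FPR) as an example of one that does not.

```python
def confusion_counts(n_pos, ratio, tpr, fpr):
    """Expected confusion-matrix counts for a classifier with fixed TPR/FPR.

    ratio is the number of negatives per positive (assumed setup).
    """
    n_neg = n_pos * ratio
    tp = tpr * n_pos
    fn = n_pos - tp
    fp = fpr * n_neg
    tn = n_neg - fp
    return tp, fn, fp, tn


def precision(tp, fn, fp, tn):
    return tp / (tp + fp)


def f1(tp, fn, fp, tn):
    return 2 * tp / (2 * tp + fp + fn)


def tss(tp, fn, fp, tn):
    # True skill statistic: TPR - FPR, independent of the class ratio.
    return tp / (tp + fn) - fp / (fp + tn)


def sweep(ratios, tpr=0.8, fpr=0.1, n_pos=1000):
    """Evaluate each metric at every imbalance ratio in `ratios`."""
    rows = []
    for r in ratios:
        counts = confusion_counts(n_pos, r, tpr, fpr)
        rows.append((r, precision(*counts), f1(*counts), tss(*counts)))
    return rows


if __name__ == "__main__":
    for r, p, f, t in sweep([1, 2, 4, 8, 16, 32, 64]):
        print(f"1:{r:<3d} precision={p:.3f} f1={f:.3f} tss={t:.3f}")
```

In this toy setup the classifier itself never changes, yet precision and F1 fall as the ratio grows while the true skill statistic stays constant, so a precision score measured at 1:1 says little about the same model at 1:64. The drop in precision between consecutive ratios also shrinks as the ratio increases, which is only loosely in the spirit of the diminishing sensitivity the abstract describes and is not a reproduction of the paper's measurement.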
