Paper Title
Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds
Paper Authors
Paper Abstract
Most NLP datasets are not annotated with protected attributes such as gender, making it difficult to measure classification bias using standard measures of fairness (e.g., equal opportunity). However, manually annotating a large dataset with a protected attribute is slow and expensive. Instead of annotating all the examples, can we annotate a subset of them and use that sample to estimate the bias? While it is possible to do so, the smaller this annotated sample is, the less certain we are that the estimate is close to the true bias. In this work, we propose using Bernstein bounds to represent this uncertainty about the bias estimate as a confidence interval. We provide empirical evidence that a 95% confidence interval derived this way consistently bounds the true bias. In quantifying this uncertainty, our method, which we call Bernstein-bounded unfairness, helps prevent classifiers from being deemed biased or unbiased when there is insufficient evidence to make either claim. Our findings suggest that the datasets currently used to measure specific biases are too small to conclusively identify bias except in the most egregious cases. For example, consider a co-reference resolution system that is 5% more accurate on gender-stereotypical sentences -- to claim it is biased with 95% confidence, we need a bias-specific dataset that is 3.8 times larger than WinoBias, the largest available.
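As a rough illustration of the idea described above (not the paper's exact formulation), the sketch below estimates the accuracy gap between two annotated groups and wraps it in a confidence interval derived from an empirical Bernstein bound (the Maurer-Pontil variant). The function names, the choice of that particular bound, the way the failure probability is split across groups and sides, and the toy data are all assumptions made here for illustration; the paper's own Bernstein-bounded unfairness measure may differ in its exact constants and setup.

```python
import numpy as np


def empirical_bernstein_radius(x, delta, value_range=1.0):
    """One-sided empirical Bernstein deviation radius for the mean of x.

    With probability >= 1 - delta (Maurer & Pontil, 2009):
        E[X] - mean(x) <= sqrt(2 * Var_n * ln(2/delta) / n)
                          + 7 * value_range * ln(2/delta) / (3 * (n - 1))
    where Var_n is the sample variance and X lies in an interval of
    width value_range (here [0, 1] for correct/incorrect indicators).
    """
    n = len(x)
    var = np.var(x, ddof=1)
    log_term = np.log(2.0 / delta)
    return (np.sqrt(2.0 * var * log_term / n)
            + 7.0 * value_range * log_term / (3.0 * (n - 1)))


def bias_confidence_interval(correct_a, correct_b, delta=0.05):
    """Confidence interval for the accuracy gap between groups A and B.

    correct_a, correct_b: 0/1 arrays indicating whether the classifier
    was correct on each annotated example from the respective group.
    The failure probability delta is split by a union bound over the
    two groups and the two sides of each deviation (delta / 4 each).
    """
    gap = np.mean(correct_a) - np.mean(correct_b)  # estimated bias
    radius = (empirical_bernstein_radius(correct_a, delta / 4)
              + empirical_bernstein_radius(correct_b, delta / 4))
    return gap - radius, gap + radius


# Toy usage: a classifier that looks ~5% more accurate on group A.
rng = np.random.default_rng(0)
a = rng.binomial(1, 0.80, size=1500)  # e.g., pro-stereotypical sentences
b = rng.binomial(1, 0.75, size=1500)  # e.g., anti-stereotypical sentences
lo, hi = bias_confidence_interval(a, b, delta=0.05)
print(f"estimated gap: {np.mean(a) - np.mean(b):.3f}, 95% CI: [{lo:.3f}, {hi:.3f}]")
# If the interval contains 0, the annotated sample is too small to
# claim the classifier is biased (or unbiased) at this confidence level.
```

With samples of this toy size, the interval typically straddles zero even though the underlying gap is 5%, which mirrors the abstract's point that small bias-specific datasets cannot conclusively establish bias except in egregious cases.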