Robustness to Spurious Correlations via Human Annotations

Paper Title

Robustness to Spurious Correlations via Human Annotations

Authors

Megha Srivastava, Tatsunori Hashimoto, Percy Liang

Abstract

The reliability of machine learning systems critically depends on the assumption that the associations between features and labels remain similar between training and test distributions. However, unmeasured variables, such as confounders, break this assumption: correlations between features and labels that are useful at training time can become useless or even harmful at test time. For example, obesity is generally predictive of heart disease, but this relation may not hold for smokers, who generally have lower rates of obesity and higher rates of heart disease. We present a framework for making models robust to spurious correlations by leveraging humans' common-sense knowledge of causality. Specifically, we use human annotation to augment each training example with a potential unmeasured variable (e.g., an underweight patient with heart disease may be a smoker), reducing the problem to a covariate shift problem. We then introduce a new distributionally robust optimization objective over unmeasured variables (UV-DRO) to control the worst-case loss over possible test-time shifts. Empirically, we show improvements of 5-10% on a digit recognition task confounded by rotation, and 1.5-5% on the task of analyzing NYPD Police Stops confounded by location.
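To make the idea of "controlling the worst-case loss" concrete, the sketch below compares an average loss with the loss of the worst-off subgroup, where subgroups are defined by an annotated unmeasured variable. This is a generic worst-group illustration, not the UV-DRO objective itself, which the abstract does not specify. The function names, the loss values, and the `smoker` / `non-smoker` labels are all hypothetical.

```python
from collections import defaultdict


def average_loss(losses):
    """Plain empirical risk: the mean of per-example losses."""
    if not losses:
        raise ValueError("losses must be non-empty")
    return sum(losses) / len(losses)


def worst_group_loss(losses, groups):
    """Return (group, mean loss) for the group with the highest mean loss.

    `groups[i]` is a hypothetical annotation of an unmeasured variable
    for example i (illustrative only; not the paper's UV-DRO objective).
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if len(losses) != len(groups):
        raise ValueError("losses and groups must have the same length")

    totals = defaultdict(float)
    counts = defaultdict(int)
    for loss, group in zip(losses, groups):
        totals[group] += loss
        counts[group] += 1

    means = {g: totals[g] / counts[g] for g in totals}
    worst = max(means, key=means.get)
    return worst, means[worst]


# Made-up per-example losses and annotations, for illustration only.
losses = [0.25, 0.25, 1.0, 0.5]
groups = ["non-smoker", "non-smoker", "smoker", "smoker"]

print(average_loss(losses))              # average over all examples
print(worst_group_loss(losses, groups))  # worst-off annotated subgroup
```

A model that looks acceptable on average can still perform poorly on one subgroup. A robust objective optimizes against that worst case, so performance does not collapse if the subgroup becomes more common at test time.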
