Robust Design and Evaluation of Predictive Algorithms under Unobserved Confounding

Paper Title

Robust Design and Evaluation of Predictive Algorithms under Unobserved Confounding

Authors

Ashesh Rambachan, Amanda Coston, Edward Kennedy

Abstract

Predictive algorithms inform consequential decisions in settings where the outcome is selectively observed given choices made by human decision makers. We propose a unified framework for the robust design and evaluation of predictive algorithms in selectively observed data. We impose general assumptions on how much the outcome may vary on average between unselected and selected units conditional on observed covariates and identified nuisance parameters, formalizing popular empirical strategies for imputing missing data such as proxy outcomes and instrumental variables. We develop debiased machine learning estimators for the bounds on a large class of predictive performance estimands, such as the conditional likelihood of the outcome, a predictive algorithm's mean square error, true/false positive rate, and many others, under these assumptions. In an administrative dataset from a large Australian financial institution, we illustrate how varying assumptions on unobserved confounding leads to meaningful changes in default risk predictions and evaluations of credit scores across sensitive groups.
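The core idea of bounding an estimand under selective observation can be illustrated with a deliberately simplified sketch. It is not the paper's estimator: it ignores covariates, nuisance parameters, and debiasing, and it assumes a binary outcome whose mean among unselected units differs from the mean among selected units by at most a user-chosen `delta`. The function name `mean_outcome_bounds` and all parameter names are hypothetical.

```python
import numpy as np


def mean_outcome_bounds(y_selected, n_unselected, delta):
    """Bound the population mean of a binary outcome observed only for selected units.

    Illustrative assumption (not taken from the paper): the unobserved mean among
    unselected units lies within +/- delta of the observed mean among selected units.
    """
    y_selected = np.asarray(y_selected, dtype=float)
    n_selected = y_selected.size
    share_selected = n_selected / (n_selected + n_unselected)

    # Mean outcome among units whose outcome was actually observed.
    mu_selected = y_selected.mean()

    # Range allowed for the unselected mean, clipped to [0, 1] for a binary outcome.
    lo_unselected = np.clip(mu_selected - delta, 0.0, 1.0)
    hi_unselected = np.clip(mu_selected + delta, 0.0, 1.0)

    # Law of total expectation applied at each end of the allowed range.
    lower = share_selected * mu_selected + (1 - share_selected) * lo_unselected
    upper = share_selected * mu_selected + (1 - share_selected) * hi_unselected
    return float(lower), float(upper)
```

Setting `delta = 0` collapses the bounds to the selected-sample mean (no unobserved confounding), while larger values of `delta` widen the interval, which mirrors how varying the confounding assumption changes the conclusions one can draw.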
