Paper title
Out-of-sample scoring and automatic selection of causal estimators
Paper authors
Abstract
Recently, many causal estimators for Conditional Average Treatment Effect (CATE) and instrumental variable (IV) problems have been published and open-sourced, making it possible to estimate the granular impact of both randomized treatments (such as A/B tests) and user choices on outcomes of interest. However, the practical application of such models has been hampered by the lack of a valid way to score their performance out of sample in order to select the best one for a given application. We address that gap by proposing novel scoring approaches for both the CATE case and an important subset of instrumental variable problems, namely those where the instrumental variable is customer access to a product feature and the treatment is the customer's choice to use that feature. Being able to score model performance out of sample allows us to apply hyperparameter optimization methods to causal model selection and tuning. We implement this in an open-source package that relies on the DoWhy and EconML libraries for the implementation of causal inference models (and also includes a Transformed Outcome model implementation), and on FLAML for hyperparameter optimization and for the component models used in the causal models. We demonstrate on synthetic data that, in the randomized CATE and IV cases, optimizing the proposed scores is a reliable method for choosing the model and its hyperparameter values, yielding estimates close to the true impact. Further, we provide examples of applying these methods to real customer data from Wise.
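The abstract does not spell out the proposed scores, so the following is only a minimal sketch of the general idea of out-of-sample scoring for CATE models under randomized treatment. It uses a transformed-outcome mean squared error as a stand-in score, which is not necessarily one of the paper's scores. All function names, the simulated data, and the two candidate models are assumptions made for illustration, and the sketch deliberately avoids the DoWhy, EconML, and FLAML APIs.

```python
import random

def true_cate(x):
    # Hypothetical ground-truth effect, used only to simulate data.
    return 1.0 + 2.0 * x

def simulate(n, rng, p=0.5):
    # Randomized treatment with known assignment probability p.
    rows = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        t = 1 if rng.random() < p else 0
        y = 0.5 * x + t * true_cate(x) + rng.gauss(0.0, 0.5)
        rows.append((x, t, y))
    return rows

def fit_line(points):
    # Ordinary least squares for y = a + b * x on (x, y) pairs.
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    return my - b * mx, b

def fit_constant(train):
    # Candidate 1: difference in means, the same effect for every unit.
    treated = [y for _, t, y in train if t == 1]
    control = [y for _, t, y in train if t == 0]
    ate = sum(treated) / len(treated) - sum(control) / len(control)
    return lambda x: ate

def fit_t_learner(train):
    # Candidate 2: separate linear outcome models per arm, then subtract.
    a1, b1 = fit_line([(x, y) for x, t, y in train if t == 1])
    a0, b0 = fit_line([(x, y) for x, t, y in train if t == 0])
    return lambda x: (a1 - a0) + (b1 - b0) * x

def transformed_outcome_mse(model, test, p=0.5):
    # Under randomization, y * (t - p) / (p * (1 - p)) has conditional
    # mean equal to the CATE, so it can act as a noisy held-out target.
    total = 0.0
    for x, t, y in test:
        y_star = y * (t - p) / (p * (1.0 - p))
        total += (y_star - model(x)) ** 2
    return total / len(test)

rng = random.Random(0)
train, test = simulate(5000, rng), simulate(5000, rng)
candidates = {
    "constant_ate": fit_constant(train),
    "linear_t_learner": fit_t_learner(train),
}
# Score every candidate on held-out data and keep the lowest score.
scores = {name: transformed_outcome_mse(m, test) for name, m in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

Because the score is computed only from held-out observations, the same loop could in principle be driven by a hyperparameter search over estimators and their settings, which is the role the abstract assigns to FLAML.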