Paper Title
Out-of-distribution Generalization with Causal Invariant Transformations
Authors
Abstract
In real-world applications, it is important and desirable to learn a model that performs well on out-of-distribution (OOD) data. Recently, causality has become a powerful tool for tackling the OOD generalization problem, with the idea resting on the causal mechanism that is invariant across the domains of interest. To leverage the generally unknown causal mechanism, existing works assume a linear form of the causal feature or require sufficiently many and diverse training domains, both of which are usually restrictive in practice. In this work, we obviate these assumptions and tackle the OOD problem without explicitly recovering the causal feature. Our approach is based on transformations that modify the non-causal feature but leave the causal part unchanged. Such transformations can be either obtained from prior knowledge or learned from the training data in the multi-domain scenario. Under the setting of an invariant causal mechanism, we theoretically show that if all such transformations are available, then we can learn a minimax optimal model across the domains using only single-domain data. Noting that knowing a complete set of these causal invariant transformations may be impractical, we further show that it suffices to know only a subset of them. Based on the theoretical findings, a regularized training procedure is proposed to improve the OOD generalization capability. Extensive experimental results on both synthetic and real datasets verify the effectiveness of the proposed algorithm, even with only a few causal invariant transformations.
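To make the idea of a regularized training procedure concrete, the following is a minimal sketch and not the authors' algorithm. The abstract does not state the form of the regularizer, so the code assumes a squared penalty on the gap between the prediction for an input and the prediction for its transformed copy, averaged over the supplied transformations. The toy data, the linear model, and the names `make_domain`, `shift_spurious`, and `fit` are all hypothetical and chosen only for illustration.

```python
import random

def make_domain(n, spurious_scale, rng):
    """Toy data (assumed): x[0] is the causal feature, x[1] a non-causal one."""
    X, y = [], []
    for _ in range(n):
        c = rng.gauss(0.0, 1.0)
        target = 2.0 * c + 0.1 * rng.gauss(0.0, 1.0)
        # The link between x[1] and the target changes from domain to domain.
        s = spurious_scale * target + 0.1 * rng.gauss(0.0, 1.0)
        X.append((c, s))
        y.append(target)
    return X, y

def shift_spurious(delta):
    """Hypothetical causal invariant transformation: alters only x[1]."""
    return lambda x: (x[0], x[1] + delta)

def predict(w, x):
    return w[0] * x[0] + w[1] * x[1]

def fit(X, y, transforms, lam, lr=0.05, steps=500):
    """Gradient descent on MSE + lam * mean squared prediction gap under transforms."""
    n, k = len(X), len(transforms)
    w = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x, target in zip(X, y):
            err = predict(w, x) - target
            for j in range(2):
                grad[j] += 2.0 * err * x[j] / n
            for t in transforms:
                tx = t(x)
                gap = predict(w, x) - predict(w, tx)
                for j in range(2):
                    grad[j] += lam * 2.0 * gap * (x[j] - tx[j]) / (n * k)
        w = [w[j] - lr * grad[j] for j in range(2)]
    return w

def mse(w, X, y):
    return sum((predict(w, x) - t) ** 2 for x, t in zip(X, y)) / len(X)

rng = random.Random(0)
X_train, y_train = make_domain(200, 1.0, rng)   # single training domain
X_test, y_test = make_domain(200, -1.0, rng)    # shifted test domain
transforms = [shift_spurious(d) for d in (-2.0, -1.0, 1.0, 2.0)]

w_erm = fit(X_train, y_train, transforms, lam=0.0)  # no invariance penalty
w_reg = fit(X_train, y_train, transforms, lam=1.0)  # with invariance penalty
print("ERM weights:", w_erm, "test MSE:", mse(w_erm, X_test, y_test))
print("Regularized weights:", w_reg, "test MSE:", mse(w_reg, X_test, y_test))
```

In this toy setup only four shift transformations are supplied, yet the penalty already pushes the weight on the non-causal feature toward zero. A worst-case variant that takes the maximum over transformations, rather than the mean, is another plausible design under the same assumptions.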