Paper Title

Discovering Reliable Causal Rules

Authors

Kailash Budhathoki, Mario Boley, Jilles Vreeken

Abstract

We study the problem of deriving policies, or rules, that when enacted on a complex system, cause a desired outcome. Absent the ability to perform controlled experiments, such rules have to be inferred from past observations of the system's behaviour. This is a challenging problem for two reasons: First, observational effects are often unrepresentative of the underlying causal effect because they are skewed by the presence of confounding factors. Second, naive empirical estimations of a rule's effect have a high variance, and, hence, their maximisation can lead to random results. To address these issues, we first measure the causal effect of a rule from observational data, adjusting for the effect of potential confounders. Importantly, we provide a graphical criterion under which causal rule discovery is possible. Moreover, to discover reliable causal rules from a sample, we propose a conservative and consistent estimator of the causal effect, and derive an efficient and exact algorithm that maximises the estimator. On synthetic data, the proposed estimator converges faster to the ground truth than the naive estimator and recovers relevant causal rules even at small sample sizes. Extensive experiments on a variety of real-world datasets show that the proposed algorithm is efficient and discovers meaningful rules.
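
To make the first challenge in the abstract concrete, the following is a minimal sketch of how a confounder can skew the observed effect of a rule, and how adjusting for it changes the estimate. It uses textbook stratified (backdoor) adjustment over a single binary confounder; it is not the conservative estimator or the search algorithm proposed in the paper. The toy counts and the names `counts`, `p_outcome`, `naive_effect`, and `adjusted_effect` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy data: (x, z, y) -> number of observations, where
# x = 1 if the rule holds, z = binary confounder, y = 1 if the outcome occurs.
counts = Counter({
    (1, 0, 1): 3,  (1, 0, 0): 7,
    (0, 0, 1): 8,  (0, 0, 0): 32,
    (1, 1, 1): 36, (1, 1, 0): 4,
    (0, 1, 1): 8,  (0, 1, 0): 2,
})


def p_outcome(counts, x, z=None):
    """Empirical P(Y=1 | X=x), or P(Y=1 | X=x, Z=z) when z is given."""
    rows = {k: n for k, n in counts.items()
            if k[0] == x and (z is None or k[1] == z)}
    total = sum(rows.values())
    positives = sum(n for (_, _, y), n in rows.items() if y == 1)
    return positives / total


def naive_effect(counts):
    """Observed difference in outcome rates, ignoring the confounder."""
    return p_outcome(counts, 1) - p_outcome(counts, 0)


def adjusted_effect(counts):
    """Stratify on z and average the per-stratum differences, weighted by P(z)."""
    n = sum(counts.values())
    effect = 0.0
    for z in sorted({k[1] for k in counts}):
        p_z = sum(c for k, c in counts.items() if k[1] == z) / n
        effect += p_z * (p_outcome(counts, 1, z) - p_outcome(counts, 0, z))
    return effect


naive = naive_effect(counts)
adjusted = adjusted_effect(counts)
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

In this toy data the rule holds mostly in the stratum where the outcome is already likely, so the naive difference (0.78 - 0.32 = 0.46) greatly overstates the per-stratum difference of 0.1 that the adjusted estimate recovers. The sketch assumes that the single variable z is a sufficient adjustment set, which in practice must be justified by a criterion such as the graphical one the abstract mentions.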
