Paper title
A Monte Carlo comparison of categorical tests of independence
Paper authors
Paper abstract
The $X^2$ and $G^2$ tests are the most frequently applied tests of independence between two categorical variables. However, to the best of our knowledge, no one has compared them extensively and ultimately answered the question of which to use and when. Further, their applicability in cases with zero frequencies has been debated, and (non-parametric) permutation tests have been suggested. In this work we perform extensive Monte Carlo simulation studies attempting to answer both of these questions. As expected, for large sample sizes ($>1,000$) the $X^2$ and $G^2$ tests are indistinguishable. For small sample sizes ($\leq 1,000$), though, we provide strong evidence supporting the use of the $X^2$ test for unconditional independence, regardless of zero frequencies. Also, we suggest the use of the permutation-based $G^2$ test for testing conditional independence, at the cost of it being computationally more expensive. The $G^2$ test exhibited inferior performance and its use should be limited.
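For readers unfamiliar with the quantities being compared, the following is a minimal sketch of the Pearson $X^2$ statistic, the likelihood-ratio $G^2$ statistic, and a permutation-based p-value for the unconditional case. It is not the authors' simulation code: the function names, the default of 999 permutations, the seed, and the convention that empty cells contribute zero to $G^2$ are all assumptions made for illustration. Conditional independence is not covered here.

```python
import math
import random
from collections import Counter


def x2_and_g2(x, y):
    """Return (X^2, G^2) for two paired categorical samples x and y."""
    n = len(x)
    row_totals = Counter(x)
    col_totals = Counter(y)
    observed = Counter(zip(x, y))
    x2 = 0.0
    g2 = 0.0
    for r, row_n in row_totals.items():
        for c, col_n in col_totals.items():
            expected = row_n * col_n / n  # expected count under independence
            obs = observed.get((r, c), 0)
            x2 += (obs - expected) ** 2 / expected
            if obs > 0:
                # Assumed convention: a zero cell adds nothing (0 * log 0 := 0).
                g2 += 2.0 * obs * math.log(obs / expected)
    return x2, g2


def permutation_pvalues(x, y, n_perm=999, seed=0):
    """Permutation p-values for X^2 and G^2 (n_perm and seed are illustrative)."""
    rng = random.Random(seed)
    x2_obs, g2_obs = x2_and_g2(x, y)
    y_perm = list(y)
    x2_hits = 0
    g2_hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # shuffling y breaks any association with x
        x2_p, g2_p = x2_and_g2(x, y_perm)
        x2_hits += x2_p >= x2_obs
        g2_hits += g2_p >= g2_obs
    # The +1 terms count the observed table as one of the permutations.
    return (x2_hits + 1) / (n_perm + 1), (g2_hits + 1) / (n_perm + 1)
```

Shuffling one variable keeps both marginal distributions fixed, so the permutation p-value does not rely on the asymptotic $\chi^2$ approximation, which is the property that makes it a candidate when the table contains zero frequencies. The price is one recomputation of the statistics per permutation.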