Paper title
Inferring independent sets of Gaussian variables after thresholding correlations
Paper authors
Paper abstract
We consider testing whether a set of Gaussian variables, selected from the data, is independent of the remaining variables. We assume that this set is selected via a very simple approach that is commonly used across scientific disciplines: we select a set of variables for which the correlation with all variables outside the set falls below some threshold. Unlike other settings in selective inference, failure to account for the selection step leads, in this setting, to excessively conservative (as opposed to anti-conservative) results. Our proposed test properly accounts for the fact that the set of variables is selected from the data, and thus is not overly conservative. To develop our test, we condition on the event that the selection resulted in the set of variables in question. To achieve computational tractability, we develop a new characterization of the conditioning event in terms of the canonical correlation between the groups of random variables. In simulation studies and in the analysis of gene co-expression networks, we show that our approach has much higher power than a "naive" approach that ignores the effect of selection.
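As a rough illustration of the two quantities the abstract mentions, the sketch below checks the thresholding criterion for a candidate set and computes the sample canonical correlations between that set and the remaining variables. It is not the paper's test: the function names, the simulated data, and the threshold value 0.3 are all assumptions made for illustration, and the conditional (selective) p-value itself is not computed here.

```python
import numpy as np

def _mask(p, group):
    """Boolean mask marking which of the p variables belong to `group`."""
    inside = np.zeros(p, dtype=bool)
    inside[list(group)] = True
    return inside

def max_cross_correlation(X, group):
    """Largest absolute sample correlation between `group` and all other variables."""
    R = np.corrcoef(X, rowvar=False)
    inside = _mask(X.shape[1], group)
    return np.abs(R[np.ix_(inside, ~inside)]).max()

def canonical_correlations(X, group):
    """Sample canonical correlations between `group` and the remaining variables.

    Uses the QR-based formulation: the singular values of Q1^T Q2, where Q1 and
    Q2 are orthonormal bases of the two centered column spaces. Assumes more
    observations than variables and full column rank.
    """
    inside = _mask(X.shape[1], group)
    Xc = X - X.mean(axis=0)
    Q1, _ = np.linalg.qr(Xc[:, inside])
    Q2, _ = np.linalg.qr(Xc[:, ~inside])
    return np.linalg.svd(Q1.T @ Q2, compute_uv=False)

# Simulated data (hypothetical): 200 observations of 6 Gaussian variables,
# with variables 0 and 1 correlated with each other but not with the rest.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
X[:, 1] += X[:, 0]

group = [0, 1]
threshold = 0.3  # hypothetical threshold, not taken from the paper

max_cross = max_cross_correlation(X, group)
is_selected = max_cross < threshold   # thresholding criterion from the abstract
cc = canonical_correlations(X, group)
```

The largest canonical correlation is never smaller than the largest absolute pairwise cross-correlation, because it maximizes correlation over all linear combinations of the two groups. This is one reason a characterization in terms of canonical correlations is a natural summary of the dependence between the selected set and the rest.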