Paper title
Cost-aware Generalized $α$-investing for Multiple Hypothesis Testing
Paper authors
Paper abstract
We consider the problem of sequential multiple hypothesis testing with nontrivial data collection costs. This problem arises, for example, when conducting biological experiments to identify differentially expressed genes in a disease process. This work builds on the generalized $α$-investing framework, which enables control of the false discovery rate in a sequential testing setting. We present a theoretical analysis of the long-term asymptotic behavior of $α$-wealth, which motivates considering sample size in the $α$-investing decision rule. Posing the testing process as a game with nature, we construct a decision rule that optimizes the expected $α$-wealth reward (ERO) and provides an optimal sample size for each test. Empirical results show that a cost-aware ERO decision rule correctly rejects more false null hypotheses than other methods for $n=1$, where $n$ is the sample size. When the sample size is not fixed, cost-aware ERO uses a prior on the null hypothesis to adaptively allocate the sample budget to each test. We extend cost-aware ERO investing to finite-horizon testing, which enables the decision rule to allocate samples in a non-myopic manner. Finally, empirical tests on real data sets from biological experiments show that cost-aware ERO balances the allocation of samples to an individual test against the allocation of samples across multiple tests.
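For readers unfamiliar with the notion of $α$-wealth, the following is a minimal sketch of the classical $α$-investing update on which the generalized framework builds. It is not the cost-aware ERO rule described in the abstract. The function name `alpha_investing`, the proportional spending policy `spend_fraction`, and all constants are assumptions chosen for illustration and are not tuned to guarantee any particular error-control level.

```python
def alpha_investing(p_values, initial_wealth=0.05, payout=0.05, spend_fraction=0.1):
    """Illustrative alpha-investing over a stream of p-values.

    Each test j is run at level alpha_j. A rejection earns `payout`;
    a non-rejection costs alpha_j / (1 - alpha_j) of alpha-wealth.
    Returns (rejections, wealth_history).
    """
    wealth = initial_wealth
    rejections = []
    history = [wealth]
    for p in p_values:
        # Hypothetical policy: risk a fixed fraction of current wealth.
        cost = spend_fraction * wealth
        # Choose alpha_j so that alpha_j / (1 - alpha_j) == cost.
        level = cost / (1.0 + cost)
        if p <= level:
            wealth += payout                  # rejection: wealth is replenished
            rejections.append(True)
        else:
            wealth -= level / (1.0 - level)   # no rejection: pay the cost
            rejections.append(False)
        history.append(wealth)
    return rejections, history


if __name__ == "__main__":
    rejected, wealth_path = alpha_investing([0.001, 0.5, 0.5])
    print(rejected)
    print(wealth_path)
```

Because the test level is tied to the remaining wealth, a long run of non-rejections shrinks the levels available to later tests. This is the kind of long-run wealth behavior that motivates bringing sample size into the decision rule.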