Learning 1-Dimensional Submanifolds for Subsequent Inference on Random Dot Product Graphs

Title

Learning 1-Dimensional Submanifolds for Subsequent Inference on Random Dot Product Graphs

Authors

Michael W. Trosset, Mingyue Gao, Minh Tang, Carey E. Priebe

Abstract

A random dot product graph (RDPG) is a generative model for networks in which vertices correspond to positions in a latent Euclidean space and edge probabilities are determined by the dot products of the latent positions. We consider RDPGs for which the latent positions are randomly sampled from an unknown $1$-dimensional submanifold of the latent space. In principle, restricted inference, i.e., procedures that exploit the structure of the submanifold, should be more effective than unrestricted inference; however, it is not clear how to conduct restricted inference when the submanifold is unknown. We submit that techniques for manifold learning can be used to learn the unknown submanifold well enough to realize benefit from restricted inference. To illustrate, we test $1$- and $2$-sample hypotheses about the Fréchet means of small communities of vertices, using the complete set of vertices to infer latent structure. We propose test statistics that deploy the Isomap procedure for manifold learning, using shortest path distances on neighborhood graphs constructed from estimated latent positions to estimate arc lengths on the unknown $1$-dimensional submanifold. Unlike conventional applications of Isomap, the estimated latent positions do not lie on the submanifold of interest. We extend existing convergence results for Isomap to this setting and use them to demonstrate that, as the number of auxiliary vertices increases, the power of our test converges to the power of the corresponding test when the submanifold is known. Finally, we apply our methods to an inference problem that arises in studying the connectome of the Drosophila larval mushroom body. The univariate learnt manifold test rejects ($p<0.05$), while the multivariate ambient space test does not ($p\gg0.05$), illustrating the value of identifying and exploiting low-dimensional structure for subsequent inference.
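
The pipeline described in the abstract (estimate latent positions, build a neighborhood graph on them, and use shortest-path distances as estimates of arc length) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: the example curve `hardy_weinberg_curve`, the use of adjacency spectral embedding to estimate latent positions, the k-nearest-neighbor rule for the neighborhood graph, and all function names and default parameters.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path


def hardy_weinberg_curve(t):
    """Illustrative 1-dimensional curve in R^3 (an assumption, not taken from the abstract).

    The coordinates are nonnegative and sum to 1, so every pairwise dot
    product lies in [0, 1] and is a valid edge probability.
    """
    t = np.asarray(t, dtype=float)
    return np.column_stack([t ** 2, 2 * t * (1 - t), (1 - t) ** 2])


def sample_rdpg(latent, rng):
    """Sample a symmetric, hollow adjacency matrix with P[i, j] = <x_i, x_j>."""
    probs = np.clip(latent @ latent.T, 0.0, 1.0)
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper | upper.T).astype(float)


def adjacency_spectral_embedding(adj, dim):
    """Estimate latent positions, up to an orthogonal transformation,
    from the `dim` eigenpairs of largest absolute eigenvalue."""
    vals, vecs = np.linalg.eigh(adj)
    top = np.argsort(np.abs(vals))[::-1][:dim]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


def isomap_distances(points, k):
    """Shortest-path distances on a symmetrized k-nearest-neighbor graph."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    n = len(points)
    # Dense graph representation: a zero entry means "no edge".
    graph = np.zeros((n, n))
    nearest = np.argsort(dist, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), k)
    graph[rows, nearest.ravel()] = dist[rows, nearest.ravel()]
    graph = np.maximum(graph, graph.T)
    return shortest_path(graph, method="D", directed=False)


def estimated_arc_lengths(adj, dim=3, k=10):
    """Compose the two steps: embed the graph, then compute Isomap-style distances.

    The defaults for `dim` and `k` are hypothetical; in practice both need tuning.
    """
    return isomap_distances(adjacency_spectral_embedding(adj, dim), k)
```

For example, one might draw `t` uniformly on an interval, set `latent = hardy_weinberg_curve(t)`, sample `adj = sample_rdpg(latent, np.random.default_rng(0))`, and compare `estimated_arc_lengths(adj)` with the true arc lengths between the sampled points. Because the estimated positions are noisy and do not lie on the curve, the shortest-path distances only approximate arc length, and the neighborhood size `k` controls a trade-off between graph connectivity and shortcuts across the curve.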
