Paper Title

Latent Network Estimation and Variable Selection for Compositional Data via Variational EM

Authors

Nathan Osborne, Christine B. Peterson, Marina Vannucci

Abstract

Network estimation and variable selection have been extensively studied in the statistical literature, but only recently have those two challenges been addressed simultaneously. In this paper, we seek to develop a novel method to simultaneously estimate network interactions and associations to relevant covariates for count data, and specifically for compositional data, which have a fixed sum constraint. We use a hierarchical Bayesian model with latent layers and employ spike-and-slab priors for both edge and covariate selection. For posterior inference, we develop a novel variational inference scheme with an expectation maximization step, to enable efficient estimation. Through simulation studies, we demonstrate that the proposed model outperforms existing methods in its accuracy of network recovery. We show the practical utility of our model via an application to microbiome data. The human microbiome has been shown to contribute to many of the functions of the human body, and also to be linked with a number of diseases. In our application, we seek to better understand the interaction between microbes and relevant covariates, as well as the interaction of microbes with each other. We provide a Python implementation of our algorithm, called SINC (Simultaneous Inference for Networks and Covariates), available online.
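To make the selection mechanism mentioned in the abstract more concrete, here is a minimal sketch of how a spike-and-slab prior can yield an inclusion probability in an EM-style E-step. It is not taken from SINC and does not reproduce the paper's model: the function names, the two-Gaussian (continuous) spike-and-slab form, and the values of `v0`, `v1`, and `theta` are all assumptions chosen for illustration.

```python
import numpy as np


def normal_pdf(x, variance):
    """Density of a zero-mean normal with the given variance."""
    return np.exp(-0.5 * x**2 / variance) / np.sqrt(2.0 * np.pi * variance)


def inclusion_probability(beta, v0=0.01, v1=10.0, theta=0.5):
    """Posterior probability that each coefficient comes from the slab.

    Assumed prior (illustrative only, not the paper's specification):
        beta ~ theta * N(0, v1) + (1 - theta) * N(0, v0), with v0 << v1.
    The narrow N(0, v0) "spike" models coefficients that are effectively
    zero; the wide N(0, v1) "slab" models coefficients that are selected.
    """
    beta = np.asarray(beta, dtype=float)
    slab = theta * normal_pdf(beta, v1)
    spike = (1.0 - theta) * normal_pdf(beta, v0)
    return slab / (slab + spike)


# Hypothetical current estimates of four coefficients (or edge weights).
coefficients = np.array([0.0, 0.05, 0.5, 2.0])
probs = inclusion_probability(coefficients)

for b, p in zip(coefficients, probs):
    print(f"beta = {b:4.2f}  ->  P(selected) = {p:.4f}")
```

Under these assumed settings, estimates near zero receive a low inclusion probability and larger ones a probability close to one. In an EM-type scheme, such probabilities would typically be recomputed at each iteration and used to weight the penalty on each coefficient or edge in the maximization step.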