Paper title

Using a Bayesian approach to reconstruct graph statistics after edge sampling

Authors

Naomi A. Arnold, Raul J. Mondragon, Richard G. Clegg

Abstract

Often, due to prohibitively large size or to limits to data collecting APIs, it is not possible to work with a complete network dataset and sampling is required. A type of sampling which is consistent with Twitter API restrictions is uniform edge sampling. In this paper, we propose a methodology for the recovery of two fundamental network properties from an edge-sampled network: the degree distribution and the triangle count (we estimate the totals for the network and the counts associated with each edge). We use a Bayesian approach and show a range of methods for constructing a prior which does not require assumptions about the original network. Our approach is tested on two synthetic and three real datasets with diverse sizes, degree distributions, degree-degree correlations and triangle count distributions.
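As a rough illustration of the setting the abstract describes, the sketch below assumes that uniform edge sampling keeps each edge independently with a known probability p. Under that assumption, a node with true degree k has an observed degree that follows a Binomial(k, p) distribution, and Bayes' rule gives a posterior over k. The function names, the value of p, and the flat prior are all hypothetical placeholders chosen for this example. The abstract does not specify the paper's own prior constructions or estimators, so this should not be read as the authors' method.

```python
import random
from math import comb


def sample_edges(edges, p, seed=0):
    """Keep each edge independently with probability p (assumed sampling model)."""
    rng = random.Random(seed)
    return [e for e in edges if rng.random() < p]


def degree_posterior(observed_degree, p, prior):
    """Posterior over the true degree k given the observed degree.

    prior: dict mapping candidate true degree k -> prior probability.
    Likelihood: observed_degree ~ Binomial(k, p).
    """
    weights = {}
    for k, prior_k in prior.items():
        if k >= observed_degree:
            likelihood = (
                comb(k, observed_degree)
                * p ** observed_degree
                * (1 - p) ** (k - observed_degree)
            )
            weights[k] = likelihood * prior_k
    total = sum(weights.values())
    if total == 0:
        raise ValueError("prior gives zero mass to all feasible degrees")
    return {k: w / total for k, w in weights.items()}


# Placeholder flat prior over true degrees 0..20 (an assumption for this sketch).
prior = {k: 1 / 21 for k in range(21)}
posterior = degree_posterior(observed_degree=3, p=0.5, prior=prior)
posterior_mean = sum(k * w for k, w in posterior.items())
```

Here `posterior_mean` is a point estimate of one node's true degree. Repeating the calculation for every sampled node and aggregating the posteriors would give one simple estimate of the degree distribution, again under the assumptions stated above.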
