Paper Title
Little Ball of Fur: A Python Library for Graph Sampling
Authors
Abstract
Sampling graphs is an important task in data mining. In this paper, we describe Little Ball of Fur, a Python library that includes more than twenty graph sampling algorithms. Our goal is to make node, edge, and exploration-based network sampling techniques accessible to a large number of professionals, researchers, and students in a single streamlined framework. We created this framework with a focus on a coherent public application interface that has a convenient design, generic input data requirements, and reasonable baseline settings for the algorithms. Here we describe these design foundations in detail, with illustrative code snippets. We show the practical usability of the library by estimating various global statistics of social networks and web graphs. Experiments demonstrate that Little Ball of Fur can speed up node and whole-graph embedding techniques considerably while only mildly degrading the predictive value of the distilled features.
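To make the three families named in the abstract concrete, here is a minimal standard-library sketch of one node-based, one edge-based, and one exploration-based sampler, plus a global statistic (average degree) that can be compared between the full graph and a sample. It is not the Little Ball of Fur API: every function name, the adjacency-set graph representation, and the toy graph are assumptions made for illustration only.

```python
import random
from typing import Dict, Set

# Assumption: an undirected graph stored as {node: set of neighbours}.
Graph = Dict[int, Set[int]]


def induced_subgraph(graph: Graph, nodes: Set[int]) -> Graph:
    """Keep only the chosen nodes and the edges running between them."""
    return {u: graph[u] & nodes for u in nodes}


def random_node_sample(graph: Graph, k: int, seed: int = 42) -> Graph:
    """Node-based sampling: draw k nodes uniformly, return the induced subgraph."""
    rng = random.Random(seed)
    return induced_subgraph(graph, set(rng.sample(sorted(graph), k)))


def random_edge_sample(graph: Graph, m: int, seed: int = 42) -> Graph:
    """Edge-based sampling: draw m edges uniformly and keep exactly those edges."""
    rng = random.Random(seed)
    edges = sorted((u, v) for u in graph for v in graph[u] if u < v)
    sampled: Graph = {}
    for u, v in rng.sample(edges, m):
        sampled.setdefault(u, set()).add(v)
        sampled.setdefault(v, set()).add(u)
    return sampled


def random_walk_sample(graph: Graph, k: int, seed: int = 42) -> Graph:
    """Exploration-based sampling: random walk until k distinct nodes are seen.

    Assumes the graph is connected and has at least k nodes.
    """
    rng = random.Random(seed)
    current = rng.choice(sorted(graph))
    visited = {current}
    while len(visited) < k:
        current = rng.choice(sorted(graph[current]))
        visited.add(current)
    return induced_subgraph(graph, visited)


def average_degree(graph: Graph) -> float:
    """A simple global statistic to compare between a graph and its sample."""
    return sum(len(nbrs) for nbrs in graph.values()) / len(graph)


def make_ring_with_chords(n: int = 20) -> Graph:
    """Toy graph: a ring where node i is also linked to node i + 5 (mod n)."""
    graph: Graph = {i: set() for i in range(n)}
    for i in range(n):
        for j in ((i + 1) % n, (i + 5) % n):
            graph[i].add(j)
            graph[j].add(i)
    return graph


if __name__ == "__main__":
    full = make_ring_with_chords()
    for sampler, size in ((random_node_sample, 8),
                          (random_edge_sample, 10),
                          (random_walk_sample, 8)):
        sample = sampler(full, size)
        print(sampler.__name__, len(sample), round(average_degree(sample), 2))
```

All three samplers take a graph and a size and return a smaller graph, so a statistic such as `average_degree` can be computed on the sample in place of the full graph. The abstract's "coherent public application interface" suggests this kind of uniformity, although the library's actual classes and signatures may differ from this sketch.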