Paper Title
SLIM: Scalable Linkage of Mobility Data
Paper Authors
Paper Abstract
We present a scalable solution to link entities across mobility datasets using their spatio-temporal information. This is a fundamental problem in many applications, such as linking user identities for security, understanding the privacy limitations of location-based services, or producing a unified dataset from multiple sources for urban planning. Such integrated datasets are also essential for service providers to optimize their services and improve business intelligence. In this paper, we first propose a mobility-based representation and similarity computation for entities. An efficient matching process is then developed to identify the final linked pairs, with an automated mechanism to decide when to stop the linkage. We scale the process with a locality-sensitive hashing (LSH) based approach that significantly reduces the number of candidate pairs for matching. To realize the effectiveness and efficiency of our techniques in practice, we introduce an algorithm called SLIM. In the experimental evaluation, SLIM outperforms the two existing state-of-the-art approaches in terms of precision and recall. Moreover, the LSH-based approach yields a speedup of two to four orders of magnitude.
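The abstract does not spell out the representation or the LSH family that SLIM uses, so the following is only a generic illustration of how LSH can cut down candidate pairs before matching. It assumes each entity is reduced to a set of coarse (grid cell, time window) tokens and applies standard MinHash with banding. The function names, the cell size, the window length, and the band/row settings are all hypothetical choices made for this sketch, not values taken from the paper.

```python
import hashlib
import math
from collections import defaultdict


def tokens(records, cell_deg=0.01, window_s=3600):
    """Turn (lat, lon, unix_ts) records into a set of coarse tokens.

    Assumption: a fixed lat/lon grid and fixed time windows; the paper's
    actual mobility representation may differ.
    """
    return {
        (math.floor(lat / cell_deg),
         math.floor(lon / cell_deg),
         math.floor(ts / window_s))
        for lat, lon, ts in records
    }


def _hash(seed, token):
    # Deterministic 64-bit hash of a token under a given seed.
    data = f"{seed}|{token}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(token_set, num_hashes):
    # One minimum per seeded hash function: the MinHash signature.
    return [min(_hash(seed, t) for t in token_set) for seed in range(num_hashes)]


def candidate_pairs(left, right, bands=16, rows=2):
    """Return (left_id, right_id) pairs whose signatures agree on some band.

    left, right: dict mapping entity id -> token set, one dict per dataset.
    """
    buckets = defaultdict(lambda: (set(), set()))
    for side, entities in enumerate((left, right)):
        for entity_id, token_set in entities.items():
            if not token_set:
                continue  # nothing to hash for an entity with no records
            sig = minhash(token_set, bands * rows)
            for b in range(bands):
                key = (b, tuple(sig[b * rows:(b + 1) * rows]))
                buckets[key][side].add(entity_id)
    pairs = set()
    for left_ids, right_ids in buckets.values():
        pairs.update((l, r) for l in left_ids for r in right_ids)
    return pairs


if __name__ == "__main__":
    left = {"u1": tokens([(40.005, 29.005, 100), (40.505, 29.505, 7300)])}
    right = {
        "v1": tokens([(40.006, 29.006, 200), (40.506, 29.506, 7400)]),
        "v2": tokens([(10.005, 10.005, 100)]),
    }
    print(candidate_pairs(left, right))
```

Only pairs that land in the same bucket for at least one band are passed on to the more expensive similarity computation, which is where the reduction in work comes from. More bands with fewer rows admit more candidates (higher recall, less pruning), while fewer bands with more rows prune harder.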