Thesis Title

Modeling and Mining Multi-Aspect Graphs With Scalable Streaming Tensor Decomposition

Author

Gujral, Ekta

Abstract

Graphs emerge in almost every real-world application domain, ranging from online social networks to health data and movie viewership patterns. Typically, such real-world graphs are large and dynamic, in the sense that they evolve over time. Furthermore, graphs usually contain multi-aspect information; for example, a social network can record several "means of communication" between nodes, such as who messages whom, who calls whom, and who comments on whose timeline. How can we model such multi-aspect graphs and mine useful patterns from them, such as communities of nodes? How can we identify dynamic patterns in those graphs, and how can we deal with streaming data when the volume of data to be processed is very large? To answer these questions, in this thesis we propose novel tensor-based methods for mining static and dynamic multi-aspect graphs. In general, a tensor is a higher-order generalization of a matrix that can represent high-dimensional multi-aspect data such as time-evolving networks, collaboration networks, and spatio-temporal data like electroencephalography (EEG) brain measurements. The thesis is organized in two synergistic thrusts. First, we focus on static multi-aspect graphs, where the goal is to identify coherent communities and patterns between nodes by leveraging the tensor structure in the data. Second, because our graphs evolve dynamically, we focus on handling such streaming updates in the data by incrementally updating the existing results rather than re-computing the decomposition.
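
To make the tensor view of a multi-aspect graph concrete, the sketch below stores a toy social network as a three-mode array indexed by (sender, receiver, means of communication). It is only an illustration under stated assumptions, not the thesis's method: the edge list, the aspect names, and the helper `build_multi_aspect_tensor` are all hypothetical, and a dense NumPy array is used for readability where a real graph would need a sparse representation.

```python
import numpy as np

# Hypothetical toy data: (sender, receiver, aspect) interaction records.
ASPECTS = ["message", "call", "comment"]
EDGES = [
    (0, 1, "message"),
    (1, 2, "message"),
    (0, 1, "call"),
    (2, 0, "comment"),
    (0, 1, "message"),  # a repeated interaction increases the count
]


def build_multi_aspect_tensor(edges, num_nodes, aspects):
    """Return a (num_nodes, num_nodes, len(aspects)) tensor of interaction counts.

    Illustrative helper (an assumption, not from the thesis): slice k of the
    third mode is the adjacency matrix of the graph for aspect k.
    """
    aspect_index = {name: k for k, name in enumerate(aspects)}
    tensor = np.zeros((num_nodes, num_nodes, len(aspects)))
    for src, dst, aspect in edges:
        tensor[src, dst, aspect_index[aspect]] += 1.0
    return tensor


X = build_multi_aspect_tensor(EDGES, num_nodes=3, aspects=ASPECTS)
print(X.shape)     # one node-by-node slice per aspect
print(X[:, :, 0])  # adjacency matrix for the "message" aspect
```

Each frontal slice `X[:, :, k]` is an ordinary adjacency matrix, so the tensor simply stacks one graph per aspect. A time-evolving graph could be laid out the same way, with time steps in place of (or alongside) the aspect mode.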
