Paper title
Model-based clustering of multiple networks with a hierarchical algorithm
Paper authors
Paper abstract
The paper tackles the problem of clustering multiple networks, directed or not, that do not share the same set of vertices, into groups of networks with similar topology. A model-based statistical approach using a finite mixture of stochastic block models is proposed. A clustering is obtained by maximizing the integrated classification likelihood criterion. This is done by a hierarchical agglomerative algorithm that starts from singleton clusters and successively merges clusters of networks. As such, a sequence of nested clusterings is computed that can be represented by a dendrogram, providing valuable insights into the collection of networks. Using a Bayesian framework, model selection is performed in an automated way, since the algorithm stops when the best number of clusters is attained. When carefully implemented, the algorithm is computationally efficient. The aggregation of clusters requires a means to overcome the label-switching problem of the stochastic block model and to match the block labels of the networks. To address this problem, a new tool is proposed based on a comparison of the graphons of the associated stochastic block models. The clustering approach is assessed on synthetic data. An application to a set of ecological networks illustrates the interpretability of the obtained results.
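To make the label-matching idea concrete, the sketch below compares two stochastic block models through their step-function graphons. It is an illustration under stated assumptions, not the paper's procedure: the function names (`block_cuts`, `graphon_sq_distance`, `match_block_labels`), the squared L2 distance, and the brute-force search over permutations are all choices made here for clarity. Each model is described by block proportions `pi` and a connectivity matrix `gamma`, which may be asymmetric for directed networks.

```python
from itertools import permutations


def block_cuts(pi):
    """Cumulative cut points of [0, 1] induced by block proportions pi."""
    cuts, acc = [0.0], 0.0
    for p in pi:
        acc += p
        cuts.append(acc)
    cuts[-1] = 1.0  # guard against floating-point drift
    return cuts


def graphon_sq_distance(pi_a, gamma_a, pi_b, gamma_b):
    """Squared L2 distance between two SBM step-function graphons.

    Blocks are laid out along [0, 1] in the order given, so the result
    depends on the block labelling (that is the label-switching issue).
    """
    ca, cb = block_cuts(pi_a), block_cuts(pi_b)
    # overlap[k][l]: length of the intersection of block k of A and block l of B
    overlap = [
        [max(0.0, min(ca[k + 1], cb[l + 1]) - max(ca[k], cb[l]))
         for l in range(len(pi_b))]
        for k in range(len(pi_a))
    ]
    total = 0.0
    for k, row_k in enumerate(overlap):
        for l, w1 in enumerate(row_k):
            if w1 == 0.0:
                continue
            for k2, row_k2 in enumerate(overlap):
                for l2, w2 in enumerate(row_k2):
                    # Both graphons are constant on this rectangle of area w1 * w2.
                    total += w1 * w2 * (gamma_a[k][k2] - gamma_b[l][l2]) ** 2
    return total


def match_block_labels(pi_a, gamma_a, pi_b, gamma_b):
    """Relabel the blocks of model B to best align with model A.

    Assumption for illustration: exhaustive search over permutations,
    which is only feasible for a small number of blocks.
    """
    best_perm, best_dist = None, float("inf")
    for perm in permutations(range(len(pi_b))):
        pi_p = [pi_b[i] for i in perm]
        gamma_p = [[gamma_b[i][j] for j in perm] for i in perm]
        dist = graphon_sq_distance(pi_a, gamma_a, pi_p, gamma_p)
        if dist < best_dist:
            best_perm, best_dist = perm, dist
    return best_perm, best_dist


# Model B is model A with its two block labels swapped.
pi_a, gamma_a = [0.3, 0.7], [[0.8, 0.1], [0.1, 0.4]]
pi_b, gamma_b = [0.7, 0.3], [[0.4, 0.1], [0.1, 0.8]]
perm, dist = match_block_labels(pi_a, gamma_a, pi_b, gamma_b)
print(perm, dist)
```

In this toy case the two models differ only by a relabelling, so the search recovers the swap and the aligned distance vanishes. In an agglomerative scheme, an alignment step of this kind would presumably precede the merging of two clusters so that their block parameters can be pooled consistently.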