Paper Title

A Spectral Method for Assessing and Combining Multiple Data Visualizations

Authors

Ma, Rong; Sun, Eric D.; Zou, James

Abstract

Dimension reduction and data visualization aim to project a high-dimensional dataset to a low-dimensional space while capturing the intrinsic structures in the data. They are an indispensable part of modern data science, and many dimension reduction and visualization algorithms have been developed. However, different algorithms have their own strengths and weaknesses, making it critically important to evaluate their relative performance for a given dataset, and to leverage and combine their individual strengths. In this paper, we propose an efficient spectral method for assessing and combining multiple visualizations of a given dataset produced by diverse algorithms. The proposed method provides a quantitative measure, the visualization eigenscore, of the relative performance of the visualizations for preserving the structure around each data point. It then leverages the eigenscores to obtain a consensus visualization, which has much improved quality over the individual visualizations in capturing the underlying true data structure. Our approach is flexible and works as a wrapper around any visualizations. We analyze multiple simulated and real-world datasets from diverse applications to demonstrate the effectiveness of the eigenscores for evaluating visualizations and the superiority of the proposed consensus visualization. Furthermore, we establish rigorous theoretical justification of our method based on a general statistical framework, yielding fundamental principles behind the empirical success of consensus visualization along with practical guidance.
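
The abstract gives no formulas, so the following is only a minimal sketch of how a per-point eigenscore and a consensus distance matrix could be computed. It is not the authors' reference implementation. It assumes that each visualization is summarized, for every data point, by its normalized vector of Euclidean distances to all other points, that the eigenscores are the absolute entries of the leading eigenvector of the resulting similarity matrix, and that the consensus is an eigenscore-weighted average of those normalized vectors. The function names `pairwise_distances` and `eigenscores_and_consensus` are hypothetical.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix (n x n) for an (n x d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def eigenscores_and_consensus(embeddings):
    """Score K visualizations per data point and build a consensus distance matrix.

    embeddings: list of K arrays, each of shape (n, d_k), all describing the
    same n data points in the same order.
    Returns (scores, consensus) with shapes (n, K) and (n, n).
    Assumption: this follows the sketch described in the lead-in, not a
    published specification.
    """
    dists = np.stack(
        [pairwise_distances(np.asarray(e, dtype=float)) for e in embeddings]
    )  # shape (K, n, n)
    num_vis, n, _ = dists.shape
    scores = np.zeros((n, num_vis))
    consensus = np.zeros((n, n))

    for i in range(n):
        # Distance profile of point i in each visualization, scaled to unit norm
        # so that visualizations with different scales are comparable.
        rows = dists[:, i, :]
        norms = np.linalg.norm(rows, axis=1, keepdims=True)
        rows = rows / np.where(norms > 0, norms, 1.0)

        # K x K similarity (cosine) matrix between the visualizations at point i.
        gram = rows @ rows.T

        # eigh returns eigenvalues in ascending order, so the last column
        # is the leading eigenvector.
        _, vecs = np.linalg.eigh(gram)
        lead = np.abs(vecs[:, -1])
        scores[i] = lead

        # Eigenscore-weighted average of the normalized distance profiles.
        consensus[i] = (lead / lead.sum()) @ rows

    # Symmetrize, because row i and column i are computed independently.
    return scores, (consensus + consensus.T) / 2.0
```

Under these assumptions, a visualization that disagrees with the others around a given point receives a smaller eigenscore there, and the symmetric matrix returned as `consensus` could be handed to any layout method that accepts precomputed distances in order to draw a consensus visualization.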
