Paper Title

STULL: Unbiased Online Sampling for Visual Exploration of Large Spatiotemporal Data

Authors

Guizhen Wang, Jingjing Guo, Mingjie Tang, José Florencio de Queiroz Neto, Calvin Yau, Anas Daghistani, Morteza Karimzadeh, Walid G. Aref, David S. Ebert

Abstract

Online sampling-supported visual analytics is increasingly important, as it allows users to explore large datasets with acceptable approximate answers at interactive rates. However, existing online spatiotemporal sampling techniques are often biased, as most researchers have primarily focused on reducing computational latency. Biased sampling approaches select data with unequal probabilities and produce results that do not match the exact data distribution, leading end users to incorrect interpretations. In this paper, we propose a novel approach to perform unbiased online sampling of large spatiotemporal data. The proposed approach ensures the same probability of selection to every point that qualifies the specifications of a user's multidimensional query. To achieve unbiased sampling for accurate representative interactive visualizations, we design a novel data index and an associated sample retrieval plan. Our proposed sampling approach is suitable for a wide variety of visual analytics tasks, e.g., tasks that run aggregate queries of spatiotemporal data. Extensive experiments confirm the superiority of our approach over a state-of-the-art spatial online sampling technique, demonstrating that within the same computational time, data samples generated in our approach are at least 50% more accurate in representing the actual spatial distribution of the data and enable approximate visualizations to present closer visual appearances to the exact ones.
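The unbiasedness property stated in the abstract (every point that satisfies the query is selected with the same probability) can be illustrated with a minimal sketch. This is not the STULL index or its sample retrieval plan, which the abstract does not describe; it swaps in plain reservoir sampling (Algorithm R) over a linear scan. The `(x, y, t)` record layout and the names `in_query` and `uniform_sample` are assumptions made for illustration only.

```python
import random
from typing import Iterable, List, Optional, Tuple

# Hypothetical record layout: (x, y, t). Not taken from the paper.
Point = Tuple[float, float, float]
Range = Tuple[float, float]


def in_query(p: Point, x_range: Range, y_range: Range, t_range: Range) -> bool:
    """Return True if the point lies inside the inclusive query box."""
    x, y, t = p
    return (x_range[0] <= x <= x_range[1]
            and y_range[0] <= y <= y_range[1]
            and t_range[0] <= t <= t_range[1])


def uniform_sample(points: Iterable[Point], k: int,
                   x_range: Range, y_range: Range, t_range: Range,
                   seed: Optional[int] = None) -> List[Point]:
    """Reservoir sampling (Algorithm R) restricted to qualifying points.

    Each qualifying point ends up in the result with equal probability
    k / n, where n is the number of qualifying points (all of them are
    returned when n <= k).
    """
    rng = random.Random(seed)
    reservoir: List[Point] = []
    seen = 0  # number of qualifying points encountered so far
    for p in points:
        if not in_query(p, x_range, y_range, t_range):
            continue
        seen += 1
        if len(reservoir) < k:
            reservoir.append(p)
        else:
            # Keep the new point with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = p
    return reservoir
```

A linear scan like this is unbiased but touches every record, so it would not reach interactive rates on large data; the abstract says the authors achieve both properties with a dedicated index and retrieval plan.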
