Paper Title

AccuStripes: Adaptive Binning for the Visual Comparison of Univariate Data Distributions

Authors

Anja Heim, Eduard Gröller, Christoph Heinzl

Abstract

Understanding and comparing distributions of data (e.g., regarding their modes, shapes, or outliers) is a common challenge in many scientific disciplines. Typically, this challenge is addressed using side-by-side comparisons of histograms or density plots. However, comparing multiple density plots is mentally demanding. Uniform histograms often represent distributions imprecisely, since missing values, outliers, or modes are hidden by bins of equal size. In this paper, a novel type of overview visualization for the comparison of univariate data distributions is presented: AccuStripes (i.e., accumulated stripes) is a new visual metaphor that encodes accumulations of data distributions according to adaptive binning, using color-coded stripes of irregular width. We provide detailed insights into the challenges of binning. Specifically, we explore different adaptive binning concepts, such as Bayesian Blocks binning and Jenks Natural Breaks binning, for the computation of bin boundaries, in terms of their capability to represent the datasets as accurately as possible. In addition, we discuss issues arising in the design of comparative visualizations of distributions: to allow for a comparison of many distributions, their accumulated representations are plotted below each other in a stacked mode. Based on our findings, we propose three different layouts for the comparative visualization of multiple distributions. The usefulness of AccuStripes is investigated using a statistical evaluation of the binning methods. Using a similarity metric from cluster analysis, it is shown which binning method statistically yields the best grouping results. Through a user study, we evaluate which binning strategy visually represents the distribution in the most intuitive form, and we investigate which layout allows the user to compare many distributions with the least effort.
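
To make the idea of adaptive binning more concrete, the following is a minimal sketch of one of the concepts named in the abstract, Jenks Natural Breaks, written as a plain dynamic program that picks bin boundaries minimizing the within-bin sum of squared deviations. It is an illustration under stated assumptions, not the authors' implementation: the function names `jenks_breaks` and `bin_counts`, the choice of returning bin edges, and the sample data are all hypothetical. The per-bin counts are the kind of accumulated values that color-coded stripes of irregular width could encode.

```python
from bisect import bisect_right
from itertools import accumulate


def jenks_breaks(values, k):
    """Return k + 1 bin edges that minimize the within-bin sum of
    squared deviations (hypothetical helper, O(k * n^2) dynamic program)."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums give the squared-deviation cost of xs[i:j] in O(1).
    s1 = [0.0] + list(accumulate(xs))
    s2 = [0.0] + list(accumulate(x * x for x in xs))

    def sse(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best cost of splitting xs[:j] into c bins.
    # cut[c][j]: start index of the last bin in that best split.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + sse(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    cut[c][j] = i

    # Walk back from the full data set to recover each bin's start index.
    starts = []
    j = n
    for c in range(k, 0, -1):
        j = cut[c][j]
        starts.append(j)
    starts.reverse()
    return [xs[i] for i in starts] + [xs[-1]]


def bin_counts(values, edges):
    """Count values per bin; bins are half-open except the last one."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        b = bisect_right(edges, v) - 1
        counts[min(b, len(counts) - 1)] += 1
    return counts


if __name__ == "__main__":
    data = [1, 2, 3, 10, 11, 12, 50, 51, 52]  # illustrative data only
    edges = jenks_breaks(data, 3)
    print(edges, bin_counts(data, edges))
```

With the illustrative data above, the three bins follow the three clusters of values, whereas three uniform bins over the same range would merge the two lower clusters and leave the middle bin empty.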
