Lumáwig：拓扑数据分析中的尺寸零瓶颈距离计算的有效算法

论文标题

Lumáwig：拓扑数据分析中的尺寸零瓶颈距离计算的有效算法

LUMÁWIG: An Efficient Algorithm for Dimension Zero Bottleneck Distance Computation in Topological Data Analysis

论文作者

Ignacio, Paul Samuel, Bulauan, Jay-Anne, Uminsky, David

论文摘要

在轻微扰动下持续图的稳定性是拓扑数据分析在探索现实世界数据中的有效性和日益普及的关键特征。这种稳定性的核心是使用瓶颈距离，该距离需要在图之间进行匹配点。然而，在实践研究中使用该指标的数量很少，而且很少是由于计算障碍物，尤其是在零维数中，计算成本随数据大小的增长而爆炸。我们提出了Lumáwig，这是一种新型的有效算法，用于计算两个持久图之间的尺寸零瓶颈距离零瓶颈距离，该图表的运行速度明显更快，并且相对于原始算法的输出而言，与其他任何可用算法相比，相对于原始算法的输出的距离明显较高。我们绕过了瓶颈距离的先前实现中压倒性的匹配问题，并证明可以从少数匹配案例中恢复零维瓶颈距离。我们表明，如经验测试所示，Lumáwig通常具有线性复杂性。我们还提出了一个应用程序，该应用程序利用尺寸零持续图和瓶颈距离来产生用于分类任务的功能。

Stability of persistence diagrams under slight perturbations is a key characteristic behind the validity and growing popularity of topological data analysis in exploring real-world data. Central to this stability is the use of Bottleneck distance which entails matching points between diagrams. Use of this metric in practical studies has, however, been few and sparingly because of the computational obstruction, especially in dimension zero where the computational cost explodes with the growth of data size. We present LUMÁWIG, a novel efficient algorithm to compute dimension zero bottleneck distance between two persistent diagrams which runs significantly faster and provides significantly sharper approximates with respect to the output of the original algorithm than any other available algorithm. We bypass the overwhelming matching problem in previous implementations of the bottleneck distance, and prove that the zero dimensional bottleneck distance can be recovered from a very small number of matching cases. We show that LUMÁWIG generally enjoys linear complexity as shown by empirical tests. We also present an application that leverages dimension zero persistence diagrams and the bottleneck distance to produce features for classification tasks.

下载PDF全文

下载文献需遵守相关版权规定

论文标题