Title
The Christoffel-Darboux kernel for topological data analysis
Authors
Abstract
Persistent homology has been widely used to study the topology of point clouds in $\mathbb{R}^n$. Standard approaches are very sensitive to outliers, and their computational complexity depends badly on the number of data points. In this paper we introduce a novel persistence module for a point cloud using the theory of Christoffel-Darboux kernels. This module is robust to (statistical) outliers in the data, and can be computed in time linear in the number of data points. We illustrate the benefits and limitations of our new module with various numerical examples in $\mathbb{R}^n$, for $n=1, 2, 3$. Our work expands upon recent applications of Christoffel-Darboux kernels in the context of statistical data analysis and geometric inference (Lasserre, Pauwels and Putinar, 2022). There, these kernels are used to construct a polynomial whose level sets capture the geometry of a point cloud in a precise sense. We show that the persistent homology associated to the sublevel set filtration of this polynomial is stable with respect to the Wasserstein distance. Moreover, we show that the persistent homology of this filtration can be computed in singly exponential time in the ambient dimension $n$, using a recent algorithm of Basu & Karisani (2022).
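To make the central object of the abstract concrete, the following is a minimal sketch of one standard way to build a Christoffel-Darboux polynomial from a point cloud: form the empirical moment matrix $M_d = \frac{1}{N}\sum_i v_d(x_i) v_d(x_i)^T$ in a monomial basis $v_d$ of degree at most $d$, and evaluate $q(x) = v_d(x)^T M_d^{-1} v_d(x)$. The monomial basis, the small ridge term `reg`, and the function names are assumptions made for illustration; they are not taken from the paper. The sketch does not compute persistent homology; it only produces the polynomial whose sublevel sets would be filtered. Building $M_d$ takes a single pass over the data, which is where the linear dependence on the number of points comes from.

```python
import itertools
import numpy as np

def monomial_exponents(n, d):
    """All exponent tuples alpha in N^n with |alpha| <= d."""
    return [a for a in itertools.product(range(d + 1), repeat=n) if sum(a) <= d]

def monomial_features(X, exponents):
    """Evaluate every monomial x^alpha at every row of X; result has shape (N, s)."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    return np.stack([np.prod(X ** np.array(a), axis=1) for a in exponents], axis=1)

def cd_polynomial(X, d, reg=1e-10):
    """Return q(y) = v_d(y)^T M_d^{-1} v_d(y) for the empirical measure on X.

    `reg` is a small ridge term (an assumption of this sketch) that keeps the
    moment matrix invertible when the data lie close to an algebraic set.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    exps = monomial_exponents(X.shape[1], d)
    V = monomial_features(X, exps)        # (N, s)
    M = V.T @ V / X.shape[0]              # empirical moment matrix, one pass over the data
    M_inv = np.linalg.inv(M + reg * np.eye(len(exps)))

    def q(Y):
        W = monomial_features(Y, exps)
        # Quadratic form v(y)^T M^{-1} v(y), evaluated row by row.
        return np.einsum("ij,jk,ik->i", W, M_inv, W)

    return q
```

In this sketch, small values of `q` indicate points that are well supported by the data and large values indicate points far from it, so the sublevel sets $\{q \le t\}$ grow outward from the point cloud as $t$ increases.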