论文标题

连续聚类和设施位置问题的近似算法

Approximation Algorithms for Continuous Clustering and Facility Location Problems

论文作者

Chakrabarty, Deeparnab, Negahbani, Maryam, Sarkar, Ankita

论文摘要

我们考虑了基于中心的聚类问题的近似性,其中要聚类的点位于公制空间中,并且没有指定候选中心。我们称此类问题为“连续”,以区分指定候选中心的“离散”聚类。对于许多目标,人们可以将连续案例减少到离散案例中,并使用$α$ -Approximation算法来使离散案例获得连续情况的$βα$ - Appproximation,其中$β$取决于目标:例如对于$ k $ -Median,$β= 2 $,对于$ K $ -Means,$β= 4 $。我们激励的问题是,$β$的差距是否是固有的,还是有没有简单地减少离散案例的算法更好?在最近的2021年苏打水论文中,Cohen-Addad,Karthik和Lee分别证明了$ 2 $的$ 2 $和$ 4 $硬度,对于连续$ k $ -Median和$ k $ -Means,即使中心$ k $ $ k $是常数的,即恒定$ k $的离散案例在多头可以解决,因此在某些制度中,$β$损失似乎不可避免。 在本文中,我们通过圆形或切割框架进行连续聚类。对于四个连续的聚类问题,我们的表现要优于离散案例的减少。值得注意的是,对于问题$λ$ -UFL,其中$β= 2 $,而离散案例的硬度为$ 1.27 $,我们获得了连续案例的近似值为$ 2.32 <2 \ times 1.27 $。同样,对于连续$ k $ - 英镑,离散案例最著名的近似值为$ 9 $,我们获得了$ 32 <4 \ times 9 $的近似值。关键的挑战是,包括最新群体在内的分散聚类的大多数算法都取决于在连续情况下变为无限大小的线性程序。为了克服这一点,我们为连续案例设计了新的线性程序,这与圆形或切割框架相应。

We consider the approximability of center-based clustering problems where the points to be clustered lie in a metric space, and no candidate centers are specified. We call such problems "continuous", to distinguish from "discrete" clustering where candidate centers are specified. For many objectives, one can reduce the continuous case to the discrete case, and use an $α$-approximation algorithm for the discrete case to get a $βα$-approximation for the continuous case, where $β$ depends on the objective: e.g. for $k$-median, $β= 2$, and for $k$-means, $β= 4$. Our motivating question is whether this gap of $β$ is inherent, or are there better algorithms for continuous clustering than simply reducing to the discrete case? In a recent SODA 2021 paper, Cohen-Addad, Karthik, and Lee prove a factor-$2$ and a factor-$4$ hardness, respectively, for continuous $k$-median and $k$-means, even when the number of centers $k$ is a constant. The discrete case for a constant $k$ is exactly solvable in polytime, so the $β$ loss seems unavoidable in some regimes. In this paper, we approach continuous clustering via the round-or-cut framework. For four continuous clustering problems, we outperform the reduction to the discrete case. Notably, for the problem $λ$-UFL, where $β= 2$ and the discrete case has a hardness of $1.27$, we obtain an approximation ratio of $2.32 < 2 \times 1.27$ for the continuous case. Also, for continuous $k$-means, where the best known approximation ratio for the discrete case is $9$, we obtain an approximation ratio of $32 < 4 \times 9$. The key challenge is that most algorithms for discrete clustering, including the state of the art, depend on linear programs that become infinite-sized in the continuous case. To overcome this, we design new linear programs for the continuous case which are amenable to the round-or-cut framework.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源