论文标题
具有多种颜色的公平聚类
Fair Clustering with Multiple Colors
论文作者
论文摘要
给出了一个公平的聚类实例,一个数据集$ a $,其中每个点都分配了一些颜色。颜色对应于各种受保护的属性,例如性别,种族或年龄。公平的聚类是一个实例,群集中的点成员与点的颜色不相关。 特别令人感兴趣的是所有颜色平等代表的情况。如果我们有两种颜色,Chierrichetti,Kumar,Lattanzi和Vassilvitskii(NIPS 2017)表明,各种$ K $ clustering目标都承认了恒定的因子近似值。从那时起,许多后续工作一直试图将此结果扩展到多色案例,尽管到目前为止,唯一已知的结果要么导致无构型因子近似,因此仅适用于特殊的聚类目标,例如$ k $ - 中心,产量双列术近似值,或要求$ k $是恒定的。 在本文中,我们将简单地从无约束的$ k $ -clustering降低到公平的$ k $ - 集群,用于各种聚类目标,包括$ k $ -Median,$ k $ -means和$ k $ center。减少仅在近似保证中失去一个恒定因素,标志着许多问题的第一个真实常数因子近似。
A fair clustering instance is given a data set $A$ in which every point is assigned some color. Colors correspond to various protected attributes such as sex, ethnicity, or age. A fair clustering is an instance where membership of points in a cluster is uncorrelated with the coloring of the points. Of particular interest is the case where all colors are equally represented. If we have exactly two colors, Chierrichetti, Kumar, Lattanzi and Vassilvitskii (NIPS 2017) showed that various $k$-clustering objectives admit a constant factor approximation. Since then, a number of follow up work has attempted to extend this result to a multi-color case, though so far, the only known results either result in no-constant factor approximation, apply only to special clustering objectives such as $k$-center, yield bicrititeria approximations, or require $k$ to be constant. In this paper, we present a simple reduction from unconstrained $k$-clustering to fair $k$-clustering for a large range of clustering objectives including $k$-median, $k$-means, and $k$-center. The reduction loses only a constant factor in the approximation guarantee, marking the first true constant factor approximation for many of these problems.