论文标题

在基于图的数据聚类中的2个俱乐部:理论和算法工程

On 2-Clubs in Graph-Based Data Clustering: Theory and Algorithm Engineering

论文作者

Figiel, Aleksander, Himmel, Anne-Sophie, Nichterlein, André, Niedermeier, Rolf

论文摘要

将图表编辑为群集的不相交联合是基于图的数据聚类中的标准优化任务。在这里,补充了群集应为集团的经典作品,我们专注于应为2个clubs的簇,即直径二的子图。这自然会导致两个NP问题2-club群集编辑(允许的编辑操作是边缘插入和边缘删除)和2-club群集顶点删除(允许的编辑操作是顶点删除)。回答文献中的一个空旷的问题,我们表明2-club群集编辑在边缘修改数量方面是w [2] - hard,因此与经典集群编辑问题的固定参数障碍性结果对比(考虑集团而不是2个clubs)。然后重点关注2-club群集顶点删除,很容易被认为是固定参数可拖动的,我们表明,在标准的复杂性理论假设下,当由Vertex Deletions的数量参数化时,它没有多项式大小的问题内核。然而,我们制定了几种有效的数据减少和修剪规则,从而产生竞争性求解器,在大多数情况下,在建立的生物测试数据集中,显然超过了标准的CPLEX求解器。

Editing a graph into a disjoint union of clusters is a standard optimization task in graph-based data clustering. Here, complementing classic work where the clusters shall be cliques, we focus on clusters that shall be 2-clubs, that is, subgraphs of diameter two. This naturally leads to the two NP-hard problems 2-Club Cluster Editing (the allowed editing operations are edge insertion and edge deletion) and 2-Club Cluster Vertex Deletion (the allowed editing operations are vertex deletions). Answering an open question from the literature, we show that 2-Club Cluster Editing is W[2]-hard with respect to the number of edge modifications, thus contrasting the fixed-parameter tractability result for the classic Cluster Editing problem (considering cliques instead of 2-clubs). Then focusing on 2-Club Cluster Vertex Deletion, which is easily seen to be fixed-parameter tractable, we show that under standard complexity-theoretic assumptions it does not have a polynomial-size problem kernel when parameterized by the number of vertex deletions. Nevertheless, we develop several effective data reduction and pruning rules, resulting in a competitive solver, clearly outperforming a standard CPLEX solver in most instances of an established biological test data set.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源