Paper title
Toward Unsupervised Outlier Model Selection
Paper authors
Paper abstract
There is no shortage of outlier detection algorithms in the literature, yet the complementary and critical problem of unsupervised outlier model selection (UOMS) is vastly understudied. In this work we propose ELECT, a new approach to selecting an effective candidate model, i.e., an outlier detection algorithm and its hyperparameter(s), to employ on a new dataset without any labels. At its core, ELECT is based on meta-learning: it transfers prior knowledge (e.g., model performance) from historical datasets that are similar to the new one to facilitate UOMS. Uniquely, it employs a performance-based dataset similarity measure, which is more direct and goal-driven than the measures used in the past. ELECT adaptively searches for similar historical datasets; as such, it can serve an output on demand and accommodate varying time budgets. Extensive experiments show that ELECT significantly outperforms a wide range of basic UOMS baselines, including no model selection (always using the same popular model, such as iForest) as well as more recent selection strategies based on meta-features.
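The idea of performance-based similarity can be illustrated with a small sketch. This is not ELECT's actual algorithm, which the abstract does not spell out; it only assumes that two datasets count as similar when candidate models rank similarly on them, and that the model chosen for a new dataset is the one that performed best on its most similar historical datasets. The function names, the toy performance matrix, and the `sim_to_new` estimates are all hypothetical.

```python
import numpy as np

def rank_similarity(perf_a, perf_b):
    """Spearman-style rank correlation between two datasets' model performances.

    Assumes no ties in either performance vector.
    """
    ranks_a = np.argsort(np.argsort(perf_a))
    ranks_b = np.argsort(np.argsort(perf_b))
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])

def select_model(hist_perf, sim_to_new, k=2):
    """Return the model with the best mean performance on the k most
    similar historical datasets, together with those datasets' indices."""
    neighbors = np.argsort(sim_to_new)[::-1][:k]
    best_model = int(np.argmax(hist_perf[neighbors].mean(axis=0)))
    return best_model, neighbors

# Toy data (hypothetical): rows are historical datasets, columns are candidate models.
hist_perf = np.array([
    [0.90, 0.60, 0.70, 0.50],
    [0.85, 0.65, 0.75, 0.55],
    [0.40, 0.80, 0.50, 0.90],
])

# Similarity between two historical datasets, computed from known performances.
sim_01 = rank_similarity(hist_perf[0], hist_perf[1])

# The new dataset has no labels, so its similarity to each historical dataset
# has to be estimated somehow; here the estimates are simply assumed.
sim_to_new = np.array([0.8, 0.9, -0.5])
best_model, neighbors = select_model(hist_perf, sim_to_new, k=2)
```

In this sketch, raising `k` widens the set of historical datasets that vote on the choice. A method that serves output on demand could stop refining the similarity estimates whenever the time budget runs out and return the current best model.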