论文标题

使用动态项目排序进行有效的约束模式挖掘,以解释分类

Efficient Constrained Pattern Mining Using Dynamic Item Ordering for Explainable Classification

论文作者

Iwashita, Hiroaki, Takagi, Takuya, Suzuki, Hirofumi, Goto, Keisuke, Ohori, Kotaro, Arimura, Hiroki

论文摘要

在过去的几年中,学习可解释的分类模型一直引起了很多关注。发现可以突出两个类之间差异的简洁和对比模式非常重要。这种模式对人类专家有用,可用于构建强大的分类器。在本文中,我们考虑在监督环境中各种约束下,从高维数据集中挖掘最小的新兴模式。我们专注于扩展名,其中模式可以包含指定不存在项目的负面项目。在这种情况下,数据库变得高度致密,并且使采矿更具挑战性,因为流行的模式挖掘技术(例如FP-Tree和发生)无法有效地工作。为了应对这种难度,我们提出了一种有效的算法,用于通过结合两种技术来挖掘最小的新出现模式:在模式搜索过程中的动态变量 - 订购以增强修剪效果,以及使用基于指针的动态数据结构,称为舞蹈链接,称为舞蹈链接,以有效地维持出现列表。基准数据集的实验表明,基于LCM,我们的算法对新兴模式挖掘方法实现了显着的加速,这是LCM,这是一种使用静态变量订购的非常快速的深度优先频繁的项目集矿工。

Learning of interpretable classification models has been attracting much attention for the last few years. Discovery of succinct and contrasting patterns that can highlight the differences between the two classes is very important. Such patterns are useful for human experts, and can be used to construct powerful classifiers. In this paper, we consider mining of minimal emerging patterns from high-dimensional data sets under a variety of constraints in a supervised setting. We focus on an extension in which patterns can contain negative items that designate the absence of an item. In such a case, a database becomes highly dense, and it makes mining more challenging since popular pattern mining techniques such as fp-tree and occurrence deliver do not efficiently work. To cope with this difficulty, we present an efficient algorithm for mining minimal emerging patterns by combining two techniques: dynamic variable-ordering during pattern search for enhancing pruning effect, and the use of a pointer-based dynamic data structure, called dancing links, for efficiently maintaining occurrence lists. Experiments on benchmark data sets showed that our algorithm achieves significant speed-ups over emerging pattern mining approach based on LCM, a very fast depth-first frequent itemset miner using static variable-ordering.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源